Abstract


The mechanical behavior of many applied materials is decisively
influenced by their microstructure. Reliable computational
homogenization methods are therefore indispensable for advancing the
development and industrial application of new material classes. One
example is directionally solidified NiAl eutectics, which exhibit
fibrous or lamellar structures and are of great research interest owing
to their high melting temperature and lightweight potential. This
material class poses major challenges for modern micromechanics
solvers, such as microstructural features on different length scales, a
high material contrast in the creep properties and demanding nonlinear
material models.


The aim of this work is therefore to investigate and develop efficient
FFT-based micromechanics solvers for computing the effective
(thermo)mechanical behavior of nonlinear composites with complex
microstructure. Both Lippmann-Schwinger solvers and polarization
methods serve as starting points for more advanced solution schemes. In
particular, we use the powerful BFGS Quasi-Newton method within
FFT-based micromechanics to develop fast, tangent-free algorithms.
Furthermore, we use Anderson's fixed-point acceleration to improve the
convergence behavior of polarization methods at infinite material
contrast.


In addition to general-purpose methods, several specialized
applications of FFT-based solvers are considered. First, the
stress-explicit formulation of the flow rule of small-strain crystal
plasticity models is exploited to reduce the computation time of FFT
solvers by about one order of magnitude. Here, the stress serves as the
primary field variable of the periodic cell problem in dual form.
Second, thermomechanically coupled problems are considered within the
framework of asymptotic homogenization. The decoupling of mechanics and
heat conduction on the microscale is used to develop an implicit solver
that is compatible with all strain-based FFT methods.

Finally, we return to the directionally solidified NiAl-Mo alloys and
use the developed methods for a detailed study of their creep behavior.
The focus lies on cellular NiAl-Mo eutectics, whose behavior has so far
not been investigated in micromechanics simulations because of their
multiscale structure. To this end, a phenomenological surrogate model
is developed that captures the creep behavior of aligned NiAl-Mo fiber
structures. The cellular microstructures are generated with a level-set
approach. With the help of the creep simulations, the distinction
between hard and soft phases in the cell structure can be clarified.
Furthermore, the influence of cell fraction and aspect ratio on the
creep behavior is analyzed.
Summary


The mechanical behavior of many applied materials arises from their mi- 
crostructure. Thus, to aid the design, development and industrialization 
of new materials, robust computational homogenization methods are 
indispensable. For instance, NiAl-based directionally solidified eutectics, 
with fibrous or lamellar microstructure, constitute a material class of 
high research interest, due to their high temperature resistance and light- 
weight potential. With structural features on different length scales, a 
high contrast of mechanical properties during creep and computationally 
demanding material models, these materials exemplify the challenges 
for modern micromechanics solvers. 

Hence, the present thesis is devoted to investigating and develop- 
ing FFT-based micromechanics solvers for efficiently computing the 
(thermo)mechanical response of nonlinear composite materials with 
complex microstructures. To this end, both Lippmann-Schwinger 
solvers and polarization schemes are considered as starting points for 
new general-purpose methods. More precisely, we investigate two novel 
applications of the powerful BFGS Quasi-Newton method in the context 
of FFT-based micromechanics, to produce fast, tangent-free algorithms. 
Moreover, we use Anderson acceleration to eliminate the main weakness 
of polarization schemes, i.e., their inability to handle materials with 
infinite contrast. 

In addition to powerful general-purpose methods, we consider a num- 
ber of specialized applications of FFT-based solvers. Firstly, noting 
that the flow rule of small-strain crystal-plasticity models is naturally 
formulated as a function of the stress, we revisit the dual variational 
framework for the periodic cell problem. Using modern FFT-methods 
in the stress-based setting, computation times for polycrystalline ma- 
terials are reduced by about an order of magnitude. Secondly, we 
consider thermomechanically coupled materials using the framework of 
asymptotic homogenization. Based on the decoupling of mechanics and 
heat conduction on the microscale, we propose an implicit staggered
approach, which is compatible with all strain-based FFT-methods.

Last but not least, we return to directionally solidified NiAl-Mo eutectics 
and use the developed solvers to thoroughly investigate their creep be- 
havior. More precisely, we focus on the case of cellular NiAl-Mo, which 
has not been subjected to a simulation study, owing to its multiscale 
microstructure. To tackle this problem, we propose a phenomenological 
surrogate model for the creep behavior of well-aligned fibrous NiAl-Mo 
and generate the cellular mesostructures based on a level-set approach. 
By micromechanical simulations, we are able to clarify the distinction 
between soft and hard regions and identify the impact of cell volume 
fraction and aspect ratio. 
Chapter 1


Introduction 


1.1 Motivation and objectives 


The effective macroscopic behavior of heterogeneous materials emerges 
from the interplay between microstructure and constituent behavior (Mc- 
Dowell, 2008). Indeed, in modern alloys, the microstructure is tailored 
to fit the desired application. One example of a material class whose
microstructure is designed to enhance the effective properties for
structural applications at high temperatures is the family of
nickel-aluminum (NiAl) based directionally solidified eutectics. In these
alloys, binary B2-ordered NiAl, with low mass density and a high melting
point, is combined with refractory metals, typically chromium (Cr) and/or
molybdenum (Mo), providing creep resistance at high temperatures.
Depending on the chemical composition (Cline and Walter, 1970; Cline et
al., 1971; Gombola et al., 2020), a directional solidification process
may result in either fibrous or lamellar structures, aligned with the
growth direction. Experimental results show that, for a fixed stress
loading, the creep rate of the reinforced alloys is up to several orders
of magnitude lower than that of binary
NiAl. However, for fine-tuning the processing parameters and the 
resulting microstructure, the impact of the morphology on the creep 
resistance of the material has to be determined. For instance, considering 
fibrous NiAl-Mo eutectics, the fiber diameter (Albiez et al., 2016a), fiber 
aspect ratio (Haenschke et al., 2010; Hu et al., 2013) and the presence
of colonies (Misra et al., 1998; Bogner et al., 2012; Seemüller et al.,
2013) all depend on the processing conditions and influence the creep 
behavior of the material. However, relying solely on experiments for 
characterizing the interplay between microstructure and mechanical 
behavior proves to be difficult. Firstly, considerable effort is associated 
with creep experiments, where a single run may take several days 
(Hu et al., 2013) excluding sample preparation. Secondly, deliberately 
manipulating the morphology is difficult, due to the sensitivity with 
respect to the processing parameters. 


Thus, efficient computational homogenization methods are crucial for 
informing the material design process by robustly predicting the material 
behavior. For this purpose, FFT-based solvers (Moulinec and Suquet, 
1994; 1998) have established themselves as powerful tools, compatible
with either real or synthetic microstructure images. In this context, alloys
of the NiAl-(Cr, Mo) system prove to be challenging, as microstructural 
features may vastly differ in their characteristic length scale. For instance, 
high fiber aspect ratios (Haenschke et al., 2010; Hu et al., 2013) or 
cellular mesostructures (Misra et al., 1998; Seemüller et al., 2013) may
necessitate a fine spatial discretization, leading to representative volume 
elements (Kanit et al., 2003) with a large number of degrees-of-freedom. 
Moreover, the crystal plasticity models, governing the constituent be- 
havior on the microscale (Albiez et al., 2016a), are associated with 
significant computational costs (Eghtesad et al., 2018a). This motivates 
us to develop and investigate highly-efficient FFT-based solvers for 
enabling the micromechanical study of materials with complex geometry 
and nonlinear material behavior. In addition to specialized methods 
for crystal plasticity models, we are interested in powerful general
purpose solvers, which are applicable to a wide range of applied
materials. Hence, next to the NiAl-based eutectics, serving as our 
primary motivation and guiding application, other material classes, such
as fiber reinforced polymers, are included as computational benchmarks.


In the following, we give a breakdown of our primary objectives: 


Interpreting the periodic cell problem in the framework of convex 
optimization (Kabel et al., 2014; Schneider, 2017a; Bellis and Suquet, 
2019), we seek to exploit modern solvers and acceleration schemes
in the context of FFT-based micromechanics. More precisely, we intro- 
duce the BFGS update (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; 
Shanno, 1970) to Lippmann-Schwinger based solvers, to generate fast 
and tangent-free algorithms. Furthermore, we use Anderson mixing 
(Anderson, 1965) to dispose of the cumbersome parameter tuning of 
polarization-based schemes, considerably broadening their range of 
application. 

All developed schemes shall be thoroughly benchmarked against 
state-of-the-art FFT-solvers. To demonstrate the general usefulness of 
our methods, we consider challenging problems of industrial scale 
and complexity with inelastic material behavior and finite as well as 
infinite material contrast. 

For some small-strain crystal plasticity models, evaluating the inverse 
material law, i.e., computing the strain as a function of the stress, is 
computationally more efficient, due to the stress-explicit formulation 
of the flow rule. To exploit this observation, we transfer FFT-based 
solvers to a dual (stress-based) framework (Bhattacharya and Suquet, 
2005). 

Based on the framework of asymptotic homogenization by Chatzige- 
orgiou et al. (2016), we seek to extend the power of FFT-based methods 
to thermomechanically coupled problems. In particular, we exploit 
the homogeneity of the temperature on the microscale to generate a 
flexible, stable and efficient solver. 

With the developed methods, we study the creep behavior of cellular 
NiAl-Mo based on the material model by Albiez et al. (2016a) and 
the experimental data by Seemüller et al. (2013). For facilitating creep
simulations on the cellular mesostructure, we develop a surrogate
model for NiAl-Mo with well-aligned fibers and use the level-set 
framework by Sonon et al. (2012; 2015) to generate high-fidelity cell 
structures. 


1.2 State of the art 


1.2.1 NiAl-based directionally solidified eutectics 


By basic thermodynamical considerations, the maximum operating
temperature is one of the main limiting factors for the efficiency of
gas turbines (Desideri, 2013). Thus, high-strength structural alloys
with a melting point beyond the limits of state-of-the-art nickel-based
superalloys are of high interest as potential turbine blade materials.
In this context, the B2-ordered intermetallic NiAl features a number
of attractive properties which have led to increased research interest
(Darolia, 1991; Miracle, 1993; Noebe et al., 1993):


1. Its melting temperature of 1638°C lies roughly 250°C above that of
   nickel-based superalloys. In addition, the thermal conductivity of
   NiAl, around 70-80 W/(m K), surpasses that of nickel-based superalloys
   by a factor of 4-8, improving the cooling efficiency. The combination
   of these thermal properties is promising for extending the operating
   limits of turbine engines.

2. With a density of roughly 6 g/cm³, NiAl is about 30% lighter than
   nickel-based superalloys, reducing the centrifugal stresses in the
   blades.

3. NiAl components develop a protective outer layer of Al₂O₃, providing
   excellent corrosion resistance.


However, binary NiAl lacks fracture toughness at low temperatures
and suffers from poor creep resistance at high temperatures, preventing
its industrial application (Darolia, 1991; Miracle, 1993; Noebe et al., 1993).
To counteract these weaknesses, the introduction of refractory metals, 
such as Cr or Mo, in combination with a directional solidification process 
(Cline et al., 1971) has emerged as a promising approach. Under these 
processing conditions, the refractory metal forms reinforcing structures 
in the direction of solidification, where the geometry of the inclusions 
depends on the chemical composition. For instance, in the NiAl-(Cr, Mo) 
system, NiAl-Mo and NiAl-Cr eutectics form fiber structures, whereas 
Cr-rich NiAl-Cr(Mo) leads to a lamellar arrangement of the phases (Cline 
and Walter, 1970; Cline et al., 1971; Gombola et al., 2020). Early studies 
on the mechanical characterization of NiAl-X eutectics (Johnson et al., 1995;
Misra et al., 1998; Whittenberger et al., 2001) found that the fracture
toughness and creep resistance were improved compared to binary NiAl
but generally not competitive with nickel-based superalloys.

For NiAl-Mo eutectics, advances in processing technology facilitated 
further improvements of the mechanical properties. More precisely, 
using an optical floating zone furnace, Bei and George (2005) were able 
to produce highly regular and well-aligned Mo-fiber structures. In 
particular, the material was virtually free of defects, such as cell and 
dendrite structures (Misra et al., 1998; Ferrandini et al., 2004), which 
deteriorate the creep resistance of the material (Seemüller et al., 2013).
Dedicated studies on the influence of processing parameters on the 
resulting microstructures (Bogner et al., 2012; Hu et al., 2012; Zhang et al., 
2013) identified a high temperature gradient at the solidification front 
combined with a slow growth rate as key for producing well-aligned 
samples. These advances sparked considerable research interest in 
the mechanical properties and underlying mechanisms of well-aligned 
NiAl-Mo eutectics. Zhang et al. (2012) described several strengthening
mechanisms, such as crack bridging and crack trapping, explaining the
increased fracture toughness of roughly 14 MPa√m for eutectic NiAl-Mo
compared to about 8 MPa√m for binary NiAl. By increasing the Mo
content beyond the eutectic composition, the fracture toughness was
further improved to above 19 MPa√m. Bei et al. (2008) and Sudharshan
Phani et al. (2011) found that the single-crystalline Mo-fibers in well- 
aligned NiAl-Mo were virtually dislocation free. This resulted in a high 
yield strength (Bei et al., 2007) and a decrease in the creep-rate by several 
orders of magnitude for a prescribed stress loading (Haenschke et al., 
2010; Dudová et al., 2011; Hu et al., 2013). Motivated by these
experimental findings, Albiez et al. (2016a) proposed suitable single-crystal
material models for the NiAl-matrix and the Mo-fibers and studied 
the effective creep behavior of the well-aligned composite through 
crystal plasticity simulations. In particular, the softening behavior 
of the material during creep was elucidated by a dislocation-based 
hardening law, generalizing an earlier model by El-Awady (2015). In 
a subsequent study (Albiez et al., 2019), the material models were 
extended by a non-local gradient-plasticity approach to account for 
the movement and transfer of dislocations. Overall, both experimental 
studies and simulations significantly improved the understanding of the 
creep behavior of well-aligned NiAl-Mo composites. 


1.2.2 FFT-based micromechanics 


FFT-based solvers, pioneered by Moulinec and Suquet (1994; 1998), com- 
bine a number of salient properties which have driven their widespread 
application in modern computational micromechanics. Firstly, they nat- 
urally operate on regular grids, i.e., voxel images. Hence, they directly 
profit from advances in modern three-dimensional imaging techniques 
(Uchic et al., 2007; Cocco et al., 2013; Epting et al., 2012), providing 
high-fidelity digital representations of real-world microstructures. In 
particular, FFT-based methods avoid the meshing step, which may prove 
infeasible considering the diversity and complexity of microstructures 
in modern applied materials (Bargmann et al., 2018). Secondly, based 
on their inherently matrix-free formulation, FFT-based methods permit 
memory efficient implementations, enabling the study of large volume
elements with many degrees of freedom. In this context, researchers also 
profit from readily available and highly optimized implementations of 
the FFT (Frigo and Johnson, 2005), boosting the computational efficiency 
of the derived solution schemes (Eisenlohr et al., 2013; El Shawish et al., 
2020). Last but not least, FFT-based methods provide great flexibility
with regard to the investigated material behavior, as inelastic problems 
were considered from the very beginning (Moulinec and Suquet, 1998). 
Notably, the original basic scheme by Moulinec and Suquet (1994; 1998) 
already featured all of the above advantages. However, the method was 
found to converge slowly for composites with high material contrast, i.e., 
the ratio of maximum and minimum eigenvalue in the (tangent-)stiffness 
field, and failed to converge at all for the case of infinite material contrast, 
e.g., pores and voids. This motivated further research efforts on the 
algorithmic foundations of FFT-based methods, especially in the areas 
of discretizations and solvers. 
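
To make the contrast sensitivity described above concrete, the following minimal sketch applies a basic-scheme-type fixed-point iteration to a scalar conduction problem on a two-phase pixel image. It is an illustration under simplifying assumptions, not the implementation used in the cited works: the scalar setting, the disc-shaped inclusion, the stopping criterion and the names `basic_scheme` and `disc_image` are hypothetical choices made for this example. The reference conductivity is taken as the mean of the extreme phase conductivities.

```python
import numpy as np

def basic_scheme(k, E, tol=1e-8, max_iter=10000):
    """Fixed-point ("basic") scheme for periodic scalar conduction on a pixel image.

    k : (n0, n1) array of local conductivities on a unit-square cell
    E : prescribed mean gradient (length-2 sequence)
    Returns the mean flux and the number of iterations used.
    """
    n0, n1 = k.shape
    k0 = 0.5 * (k.min() + k.max())                  # reference conductivity
    # Integer frequency vectors xi for every Fourier mode, shape (2, n0, n1)
    xi = np.stack(np.meshgrid(np.fft.fftfreq(n0, d=1.0 / n0),
                              np.fft.fftfreq(n1, d=1.0 / n1), indexing="ij"))
    xi2 = (xi ** 2).sum(axis=0)
    xi2[0, 0] = 1.0                                 # avoid 0/0; zero mode is reset below
    E = np.asarray(E, dtype=float)
    e = np.tile(E[:, None, None], (1, n0, n1))      # start from the homogeneous gradient
    for it in range(1, max_iter + 1):
        q_hat = np.fft.fft2(k * e, axes=(1, 2))     # flux q = k e in Fourier space
        # Gamma^0 q: project the flux onto gradient fields, scaled by 1/k0
        g_hat = xi * (xi * q_hat).sum(axis=0) / (k0 * xi2)
        g_hat[:, 0, 0] = 0.0                        # keep the mean gradient equal to E
        de = np.fft.ifft2(g_hat, axes=(1, 2)).real
        e -= de                                     # update e <- e - Gamma^0 q
        if np.sqrt((de ** 2).mean()) <= tol * np.linalg.norm(E):
            break
    return (k * e).mean(axis=(1, 2)), it

def disc_image(n, contrast, radius=0.3):
    """Matrix of unit conductivity with a centered disc of conductivity `contrast`."""
    x = (np.arange(n) + 0.5) / n - 0.5
    X, Y = np.meshgrid(x, x, indexing="ij")
    return np.where(X ** 2 + Y ** 2 <= radius ** 2, contrast, 1.0)

for contrast in (10.0, 100.0):
    q_mean, n_iter = basic_scheme(disc_image(31, contrast), E=[1.0, 0.0])
    print(f"contrast {contrast:6.1f}: k_eff = {q_mean[0]:.4f}, iterations = {n_iter}")
```

For a two-phase image, each update shrinks by a factor proportional to (c - 1)/(c + 1), where c denotes the contrast, so the iteration count grows as the contrast increases. An odd grid size is used here to avoid the special treatment of the Nyquist frequency.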

Alternative discretizations, such as finite differences (Willot, 2015; Schnei- 
der et al., 2016), finite volumes (Dorn and Schneider, 2019) and finite 
elements (Schneider et al., 2017; Leuschner and Fritzen, 2018), were 
initially introduced to reduce the oscillations associated with the original
discretization by trigonometric polynomials. More importantly, it was 
realized that the convergence behavior for problems with infinite con- 
trast was not only a matter of solution scheme but depended critically on 
the choice of discretization. Indeed, under some regularity assumptions 
on the underlying microstructure, convergence of the basic scheme (and 
related methods) could be established for finite difference and finite 
element discretizations (Schneider, 2020b). 

Many successful FFT-algorithms build directly upon the basic scheme 
and the associated Lippmann-Schwinger equation. In this context, 
Zeman et al. (2010) introduced Krylov-subspace solvers, displaying 
excellent performance for linear elastic problems. Their application was 
extended to nonlinear problems via Newton-Krylov methods
(Gélébart and Mondon-Cancel, 2013; Kabel et al., 2014) or in the form 
of nonlinear conjugate gradients (Schneider, 2020a). By exposing the 
basic scheme as a gradient descent method (Kabel et al., 2014) the 
toolbox of modern optimization algorithms (Boyd and Vandenberghe, 
2004; Nocedal and Wright, 1999) was made available to FFT-based 
micromechanics. Momentum-based fast gradient methods (Schneider, 
2017a; Ernesti et al., 2020) were shown to considerably improve upon 
the performance of vanilla gradient descent. As tangent-free alternatives 
to Newton’s method, Quasi-Newton approaches entered FFT-based 
micromechanics in the form of Anderson acceleration (Shantraj et al., 
2015; Chen et al., 2019a;b) and the Barzilai-Borwein step size (Schneider, 
2019a), see Ch. 3 for further details. 


In contrast to the Lippmann-Schwinger solvers, which operate on dis- 
placements or compatible strain-fields, Eyre and Milton (1999) proposed 
an accelerated scheme with a polarization as primary variable. Initially 
formulated for conductivity problems, the Eyre-Milton method was 
adapted to linear elasticity by Michel et al. (2001) and proved to converge 
much faster than the basic scheme. In the same study, Michel et al. 
(2001) proposed an augmented Lagrangian version of the cell problem 
and solved it with ADMM. As an algorithm for constrained nonlinear 
optimization, ADMM appeared to share no connection to the Eyre- 
Milton method, which was motivated by series acceleration techniques 
and restricted to linear problems. Remarkably, for the case of linear 
elasticity, Moulinec and Silva (2014) identified both methods as members 
of a general family of polarization schemes by Monchiet and Bonnet 
(2012) and provided convergence estimates. The results were extended 
to the nonlinear setting by connecting the polarization methods to the 
classical Douglas-Rachford splitting (Schneider et al., 2019). Overall, 
for strongly convex problems, polarization methods combine excellent 
performance with a low memory footprint. However, the unclear choice 
of algorithmic parameters for infinitely contrasted problems limits their
flexibility compared to Lippmann-Schwinger solvers, see Ch. 6 for a 
detailed discussion and a proposed remedy. 

Based on these advances in discretization and solver technology, FFT- 
based schemes have found application in a wide variety of problem 
settings. Examples include polycrystals at small (Lebensohn et al., 2012) 
and finite strains (Eisenlohr et al., 2013), stress localization (Rollett 
et al., 2010), slip band formation (Marano et al., 2019; Marano and 
Gélébart, 2020), fatigue-lifetime estimation (Lucarini and Segurado, 
2019), damage (Boeff et al., 2015) and fracture mechanics (Chen et al., 
2019b), electro-mechanically coupled materials (Vidyasagar et al., 2017), 
the mantle flow of geophysical minerals (Castelnau et al., 2008),
homogenization of the elastic (Schneider, 2017b; Görthofer et al., 2020)
and rate-dependent (Staub et al., 2018) behavior of fiber-reinforced 
composites, the anisotropic thermoelastic behavior of explosive ma- 
terials (Gasnier et al., 2015) and concurrent multi-scale simulations 
(Kochmann et al., 2018; Göküzüm et al., 2019). For a broader overview 
of practical applications, we refer to (Schneider, 2021, Sec. 5). Segurado 
et al. (2018) and Lebensohn and Rollett (2020) provide reviews focus- 
ing on polycrystalline materials. An overview of modern multiscale 
approaches, where FFT-methods may enter as solver on the microscale, 
is given by Matouš et al. (2017). 


1.3 Originality and outline 


Chapter 2 This chapter briefly establishes the fundamentals of small- 
strain continuum mechanics, serving as the basic framework of this 
thesis. In particular, we review the kinematic assumptions, the under- 
lying balance equations and thermodynamic restrictions on the ma- 
terial behavior. On this basis, we revisit the periodic cell problem 
of computational micromechanics. The equivalent reformulations of 
the problem in the form of the Lippmann-Schwinger equation and the
Eyre-Milton equation are introduced, each serving as the starting point 
for distinct FFT-based solution algorithms. By embedding the problem 
in a variational framework, we draw the connection from FFT-based 
methods to classical solvers of convex optimization. Note that we do not 
claim any originality for the contents of this chapter. Instead, we seek 
to provide additional context and a basic framework for the following 
studies. 


Chapter 3 This chapter is devoted to investigating the power of Quasi- 
Newton methods in the context of FFT-based micromechanics. More 
precisely, we propose two novel algorithms exploiting the BFGS Hessian 
approximation, leading to fast tangent-free solvers. In this context, 
we discuss suitable line search criteria and forcing term strategies for 
inexact (Quasi-)Newton methods. In numerical experiments, we com- 
pare the performance and convergence behavior of the newly proposed 
algorithms to modern Lippmann-Schwinger solvers. The results reflect 
the strengths and weaknesses of the different algorithms and show 
which solvers excel for the special cases of computationally cheap and 
expensive material laws. 

Chapter 4 In contrast to the last chapter which dealt with general pur- 
pose methods, we consider the special case of small-strain single crystal 
elasto-viscoplasticity. Based on the observation that evaluating the 
inverse constitutive law is less costly for some formulations of the 
material model, we propose solving the associated cell problem in 
a stress-based framework. To this end, we revisit both the primal 
and dual variational setting and show their equivalence for arbitrary 
mixed boundary conditions. Numerical experiments demonstrate that 
the performance of FFT-based methods improves by about an order 
of magnitude with respect to computation time in the stress-based 
formulation. 

Chapter 5 The interplay between temperature and deformation may
have a significant impact on the effective behavior of microstructured 
materials under thermomechanical loadings. Based on the framework of 
asymptotic homogenization by Chatzigeorgiou, we propose an implicit 
staggered scheme for thermomechanically coupled problems which is
compatible with arbitrary strain- or displacement-based micromechanics
solvers. Exploiting the homogeneity of the temperature on the mi- 
croscale, the proposed approach preserves the computational power of 
FFT-based schemes by introducing little overhead. As a particularly chal- 
lenging example with strong temperature sensitivity and pronounced 
thermomechanical coupling, we consider the case of glass-fiber rein- 
forced polypropylene to demonstrate the efficiency of our approach. 

Chapter 6 Having thoroughly investigated modern Lippmann-Schwinger
solvers, we turn to polarization-based methods. In earlier studies, these
algorithms have proven to be very fast and memory efficient. However,
their use as general purpose solvers is limited by their sensitivity to the
choice of algorithmic parameters. To tackle this problem, we propose
combining polarization-based schemes with Anderson acceleration,
resulting in a fast and robust algorithm which is competitive with the fastest
Lippmann-Schwinger solvers. In particular, Anderson acceleration leads 
to a vastly improved convergence behavior for problems with infinite 
material contrast, where polarization-based schemes have typically 
struggled. 

Chapter 7 Following the previous method-driven chapters, we consider 
an application-oriented problem. More precisely, we use FFT-based 
methods to thoroughly investigate the creep behavior of cellular NiAl- 
Mo alloys. To this end, we build upon the studies by Albiez et al. (2016a) 
and Seemüller et al. (2013) to formulate a surrogate model for the
well-aligned creep resistant regions and generate suitable microstructures
using a level set approach. The simulations shed light on the proper 
classification of soft intercellular regions, which are the root cause for 
the notable loss of creep resistance in the cellular material. In addition,
the impact of cell volume fraction and aspect ratio on the effective creep
rate is identified, improving upon coarser analytical estimates.

Chapter 8 Last but not least, we summarize our most important findings
and close with some concluding remarks. 


1.4 Remarks on the notation 


In the present manuscript, newly introduced quantities are defined upon 
the first appearance in each chapter. Where appropriate, this includes 
the explicit expression and details such as function space, domain of 
definition and tensor order. Note that, in general, the latter information 
is not implicitly encoded in the notation, for instance, by specific typesets 
or markers. Tensor contractions are marked by dots, i.e., a single tensor
contraction is denoted by $\cdot$, a double tensor contraction reads $:$
and $::$ is a quadruple tensor contraction. For instance, with scalars
$\alpha$, $\beta$, $\gamma$, vectors $u$, $v$, second order tensors $A$, $B$
and fourth order tensors $C$, $D$, the expression $\alpha = u \cdot v$ is
equivalent to $\alpha = u_i v_i$, $u = A \cdot v$ is equivalent to
$u_i = A_{ij} v_j$, $\beta = A : B$ is equivalent to $\beta = A_{ij} B_{ij}$,
$A = C : B$ is equivalent to $A_{ij} = C_{ijkl} B_{kl}$ and $\gamma = C :: D$
is equivalent to $\gamma = C_{ijkl} D_{ijkl}$, using the summation
convention and index notation. The transposition of a second order tensor
$A$ is denoted by $A^T$. $I$ stands for the identity. The tensor product is
defined by $(u \otimes v) \cdot w = u (v \cdot w)$ and its symmetrized
version reads $u \otimes^s v = \frac{1}{2}(u \otimes v + v \otimes u)$.
$\mathrm{Sym}(d)$ stands for the space of symmetric second order tensors in
$\mathbb{R}^d$ and linear operators on $\mathrm{Sym}(d)$ are denoted by
$L(\mathrm{Sym}(d))$. Note that elements of $L(\mathrm{Sym}(d))$, when
interpreted as fourth order tensors, are endowed with the left and right
minor symmetries. Throughout this manuscript, we operate in Cartesian
coordinates. Thus, for ease of exposition, we do not particularly
emphasize the distinction between tensors and matrices in most of the
text. Note, however, that in a broader continuum mechanics context, the
concept of tensors as basis-independent quantities cannot be neglected
in general (Bertram, 2011). 
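
As an illustration of the contraction conventions above, the index expressions can be transcribed one-to-one with NumPy's `einsum`. The sketch below uses random test data; the variable names (`alpha`, `w`, `CB`, `sym_dyad`, ...) are chosen for this example only and are not part of the manuscript's notation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
u, v = rng.standard_normal((2, d))               # vectors
A, B = rng.standard_normal((2, d, d))            # second order tensors
C, D = rng.standard_normal((2, d, d, d, d))      # fourth order tensors

alpha = np.einsum("i,i->", u, v)                 # alpha = u . v   = u_i v_i
w = np.einsum("ij,j->i", A, v)                   # w = A . v       : w_i = A_ij v_j
beta = np.einsum("ij,ij->", A, B)                # beta = A : B    = A_ij B_ij
CB = np.einsum("ijkl,kl->ij", C, B)              # C : B           : (C:B)_ij = C_ijkl B_kl
gamma = np.einsum("ijkl,ijkl->", C, D)           # gamma = C :: D  = C_ijkl D_ijkl

dyad = np.einsum("i,j->ij", u, v)                # tensor product u (x) v
sym_dyad = 0.5 * (dyad + dyad.T)                 # symmetrized tensor product

# Defining property of the tensor product: (u (x) v) . w = u (v . w)
lhs = np.einsum("ij,j->i", dyad, w)
rhs = u * np.einsum("i,i->", v, w)
print(np.allclose(lhs, rhs))
```

Writing the index pattern explicitly in the `einsum` string mirrors the summation convention, which helps to avoid ambiguities about which indices are contracted.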
Chapter 2


Fundamentals 


2.1 Elementary continuum mechanics 


The following sections give a brief introduction to the theory of small- 
strain continuum mechanics, serving as the fundamental framework 
throughout this manuscript. Starting with basic kinematics, we specify 
the assumptions for the small-strain setting. Subsequently, the bal- 
ance equations, forming the basis of thermomechanical boundary value 
problems, are established. Last but not least, we discuss common ther- 
modynamical restrictions on material laws and introduce generalized 
standard materials as a convenient framework for material modeling. 
For further details on the continuum mechanical background, we refer to 
the monographs by Šilhavý (1997), Liu (2002), Haupt (2002) and Bertram
(2011). 


2.1.1 Kinematics 


Let $\Omega_0 \subset \mathbb{R}^d$ be the space occupied by a body in an
arbitrary reference placement. In this manuscript, we mostly consider
three-dimensional problems, i.e., $d = 3$. The material points of the body
are labeled by their reference position $X \in \Omega_0$ (Šilhavý, 1997).
The motion of the body is described by the bijective function

$$\chi : \Omega_0 \times [0,T] \to \mathbb{R}^d, \quad (X,t) \mapsto \chi(X,t), \tag{2.1}$$

which maps material points to their current position $x$

$$x = \chi(X,t). \tag{2.2}$$

The associated current placement of the body reads (Šilhavý, 1997)

$$\Omega_t = \{ x = \chi(X,t) \mid X \in \Omega_0 \}. \tag{2.3}$$


In general, any tensor field $\Xi$ on the material body may be parameterized
in terms of the reference placement,
$\Xi_L : \Omega_0 \times [0,T] \to \mathbb{R}^{d \times \dots \times d}$ (Lagrangian description),
or in terms of the current placement,
$\Xi_E : \Omega_t \times [0,T] \to \mathbb{R}^{d \times \dots \times d}$ (Eulerian description),
with (Haupt, 2002)

$$\Xi_E(x,t) = \Xi_L(\chi^{-1}(x,t), t), \tag{2.4}$$
$$\Xi_L(X,t) = \Xi_E(\chi(X,t), t). \tag{2.5}$$


For better readability, the subscripts are only written out where we wish 
to emphasize the parameterization. Otherwise, the parameterization is 
implied by the argument. 

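The so-called material time derivative of a function is defined as the
partial time derivative for a fixed reference position $X$ (Haupt, 2002)

$$\dot{\Xi}(X,t) = \frac{\partial \Xi_L}{\partial t}(X,t). \tag{2.6}$$

Consequently, the velocity $v$ and the acceleration $a$ of the material body
are given by

$$v(X,t) = \dot{\chi}(X,t), \quad a(X,t) = \ddot{\chi}(X,t). \tag{2.7}$$

For an Eulerian field $\Xi_E(x,t)$, the material time derivative reads (Haupt, 2002)

$$\dot{\Xi}_E(x,t) = \frac{\partial \Xi_E}{\partial t}(x,t) + \frac{\partial \Xi_E}{\partial x}(x,t) \cdot v(x,t). \tag{2.8}$$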
The first spatial derivative of the motion is denoted by
$F : \Omega_0 \times [0,T] \to \mathbb{R}^{d \times d}$ and referred to as deformation gradient (Haupt, 2002)

$$F(X,t) = \frac{\partial \chi}{\partial X}(X,t). \tag{2.9}$$

In particular, $F$ maps infinitesimal line, area and volume elements $dX$,
$dA$, $dV$ in the reference configuration $\Omega_0$ to the respective elements $dx$,
$da$, $dv$ in the current configuration $\Omega_t$ (Haupt, 2002)

$$dx = F \cdot dX, \quad da = \det(F)\, F^{-T} \cdot dA, \quad dv = \det(F)\, dV. \tag{2.10}$$


For physically meaningful deformations, it is generally assumed that
$\det(F) > 0$ to avoid compression to zero or even negative volume.
Let $\mathrm{Sym}(d)$ stand for the space of symmetric second order tensors of
dimension $d$ and denote the associated subset of symmetric and positive
definite tensors by $\mathrm{Sym}^+(d)$. Abusing notation, any deformation
gradient $F = F(X,t)$ may be split

$$F = R \cdot U = V \cdot R \tag{2.11}$$

into a symmetric and positive definite part $U, V \in \mathrm{Sym}^+(d)$ and a
proper orthogonal part $R \in SO(d)$ (Haupt, 2002). The right and left
stretch tensors $U$ and $V$ share the same eigenvalues, which are identified
with the principal stretches. $R$ refers to the mean rotation.

In the undeformed state, the principal stretches are equal to one, i.e.,
$U = V = I$. In engineering, strain measures which are zero in the undeformed
state are commonly used. The family of Seth-Hill strains (Seth,
1961; Hill, 1968), defined by

$$E^{\mathrm{Seth}} = f_m(U) \tag{2.12}$$

and the scalar function

$$f_m(\lambda) = \begin{cases} \frac{1}{m}(\lambda^m - 1), & m \neq 0, \\ \ln(\lambda), & m = 0, \end{cases} \tag{2.13}$$

covers many common strain measures, such as the Hencky strain
($m = 0$), Biot strain ($m = 1$) and Green strain ($m = 2$).
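
The scalar Seth-Hill function can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `seth_hill` and the sample stretch are hypothetical choices made for illustration. It evaluates the function for a principal stretch and shows that the Hencky, Biot and Green measures vanish for a stretch of one and nearly coincide for stretches close to one.

```python
import math

def seth_hill(lam: float, m: float) -> float:
    """Scalar Seth-Hill function applied to a principal stretch lam > 0."""
    if m == 0:
        return math.log(lam)        # Hencky (logarithmic) strain
    return (lam**m - 1.0) / m       # m = 1: Biot strain, m = 2: Green strain

# Undeformed state: every member of the family returns zero.
undeformed = [seth_hill(1.0, m) for m in (0, 1, 2)]

# Stretch close to one (hypothetical value): the measures agree to first order.
lam = 1.001
strains = {m: seth_hill(lam, m) for m in (0, 1, 2)}
print(undeformed, strains)
```

For the tensor-valued strain, the same function would be applied to the eigenvalues of the stretch tensor, which is omitted here to keep the sketch free of external libraries.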

To consider the geometrically linear setting (Liu, 2002), we introduce the
displacement $u : \Omega_0 \times [0,T] \to \mathbb{R}^d$ defined by

$$u(X,t) = \chi(X,t) - X \tag{2.14}$$

and the associated displacement gradient $H : \Omega_0 \times [0,T] \to \mathbb{R}^{d \times d}$

$$H(X,t) = \frac{\partial u}{\partial X}(X,t) = F(X,t) - I. \tag{2.15}$$

For small deformations, it is assumed that the Frobenius norm $\|\cdot\|$ of
all displacement gradients $H = H(X,t)$ is small

$$\|H\| \ll 1. \tag{2.16}$$

Linearization around $H = 0$ yields (Liu, 2002)

$$E^{\mathrm{Seth}} \approx \varepsilon, \tag{2.17}$$
$$U \approx I + \varepsilon, \tag{2.18}$$
$$R \approx I + \omega, \tag{2.19}$$


with the infinitesimal strain $\varepsilon$

$$\varepsilon = \tfrac{1}{2}(H + H^T) \tag{2.20}$$

and the infinitesimal rotation $\omega$

$$\omega = \tfrac{1}{2}(H - H^T) \tag{2.21}$$

as symmetric and skew-symmetric parts of $H$, respectively. In addition,
it is typically assumed that the displacement is small as well, so
that $x \approx X$. Thus, the distinction between Lagrangian and Eulerian
parameterization vanishes and the material time derivative reduces to 
the partial time derivative (Haupt, 2002). 
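
The linearization can be checked numerically. The following is a minimal sketch in plain Python, with a hypothetical displacement gradient and hypothetical helper names. It splits $H$ into its symmetric and skew-symmetric parts and compares the infinitesimal strain with the Green strain $\tfrac{1}{2}(F^T F - I)$, i.e., the Seth-Hill member with $m = 2$. The two differ by $\tfrac{1}{2} H^T H$, which is of second order in $\|H\|$.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def frobenius(A):
    return sum(a * a for row in A for a in row) ** 0.5

def small_strain_split(H):
    """Return (eps, omega): symmetric and skew-symmetric parts of H."""
    n = len(H)
    eps = [[0.5 * (H[i][j] + H[j][i]) for j in range(n)] for i in range(n)]
    omega = [[0.5 * (H[i][j] - H[j][i]) for j in range(n)] for i in range(n)]
    return eps, omega

def green_strain(H):
    """Green strain 1/2 (F^T F - I) with F = I + H."""
    n = len(H)
    F = [[H[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    C = matmul(transpose(F), F)
    return [[0.5 * (C[i][j] - (1.0 if i == j else 0.0)) for j in range(n)]
            for i in range(n)]

# Hypothetical small displacement gradient
H = [[1e-3, 2e-3, 0.0],
     [0.0, -5e-4, 1e-3],
     [1e-3, 0.0, 2e-4]]
eps, omega = small_strain_split(H)
E = green_strain(H)
error = frobenius([[E[i][j] - eps[i][j] for j in range(3)] for i in range(3)])
print(error, frobenius(H) ** 2)
```

The printed difference stays below $\tfrac{1}{2}\|H\|^2$, which is consistent with the approximation (2.17) being exact up to terms of second order.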


2.1.2 Balance equations 


The (thermo)mechanical behavior of a body is governed by physical 
laws in the form of balance equations. For specified external loadings, 
these equations give rise to boundary value problems which may, in turn, 
be solved either analytically or numerically. The general integral balance 
of an arbitrary tensor field $\Xi$ over any regular bounded subregion of a
body $P_t \subset \Omega_t$ with boundary $\partial P_t$ reads (Liu, 2002)

$$\frac{d}{dt} \int_{P_t} \Xi \, dv = \int_{\partial P_t} q_\Xi \cdot n \, da + \int_{P_t} p_\Xi + s_\Xi \, dv, \tag{2.22}$$

where the non-convective flux $q_\Xi$ is one tensor order above $\Xi$ and
the internal production $p_\Xi$ and external supply $s_\Xi$ have the same tensor
order as $\Xi$. Applying Reynolds' transport theorem and the divergence
theorem yields the local form in regular points

$$\frac{\partial \Xi}{\partial t} + \operatorname{div}(\Xi \otimes v) = \operatorname{div} q_\Xi + p_\Xi + s_\Xi, \tag{2.23}$$

as (2.22) has to hold for arbitrary $P_t$ (Liu, 2002). For simplicity of
exposition, we do not consider singular surfaces and the associated 
jump conditions. 
Mass. For mass conservation, $\Xi$ is identified with the mass density
$\rho : \Omega_t \times [0,T] \to \mathbb{R}$, and production, supply and flux are zero (Liu, 2002).
Thus, the local balance reads

$$\dot{\rho} + \rho \operatorname{div} v = 0. \tag{2.24}$$

Note that in continuum solid mechanics, the balance of mass is typically
not explicitly considered. Indeed, for given deformation gradient and
mass density $\rho_0$ in the reference configuration, the current mass density may
be computed by $\rho = \det(F)^{-1} \rho_0$. Similarly, in the small-strain context
$\rho = (1 - \operatorname{tr}(\varepsilon)) \rho_0$ holds. However, owing to (2.16), the density is often
approximated as constant in time, $\rho \approx \rho_0$.

Linear and angular momentum. With the linear momentum density $\rho v$
as balanced field, the volume force density $b : \Omega_t \times [0,T] \to \mathbb{R}^d$ as supply
term and the Cauchy stress tensor $\sigma : \Omega_t \times [0,T] \to \mathbb{R}^{d \times d}$ as flux, the
balance of linear momentum reads (Liu, 2002)

$$\rho a = \operatorname{div} \sigma + b. \tag{2.25}$$

Note that, in this manuscript, we restrict to the quasi-static setting where
the acceleration term vanishes. Under the assumption that the balance
of linear momentum (2.25) holds, the balance of angular momentum
may be condensed to

$$\sigma = \sigma^T, \tag{2.26}$$

i.e., the symmetry of the stress tensor $\sigma(x,t) \in \mathrm{Sym}(d)$ (Liu, 2002).


Energy. The total energy density is comprised of the internal energy
density¹ $e : \Omega_t \times [0,T] \to \mathbb{R}$ and the kinetic energy $\tfrac{1}{2} \rho\, v \cdot v$.

¹ From a physical viewpoint, modeling the mass specific internal energy $\tilde{e} = e/\rho$ is
preferable. However, in a small-strain context, where $\rho$ may be approximated as a
constant conversion factor, using the volume density $e$ is more convenient. The same
holds for the entropy $s$ and the free energy $\psi$.

Thus, the conservation of energy, known as the first law of thermodynamics, reads

$$\dot{e} + \left(\tfrac{1}{2} \rho\, v \cdot v\right)^{\cdot} = b \cdot v + w + \operatorname{div}(\sigma \cdot v) - \operatorname{div} q, \tag{2.27}$$

where the supply term consists of internal heat sources $w : \Omega_t \times [0,T] \to \mathbb{R}$
and the power of the volume forces $b \cdot v$, and the flux is given by the
negative heat flux $-q$ and the mechanical power $v \cdot \sigma \cdot n$. By subtracting
the balance of linear momentum (2.25) multiplied by the velocity, the
conservation of total energy may be condensed to the balance of internal
energy

$$\dot{e} = -\operatorname{div} q + w + \sigma : \dot{\varepsilon}. \tag{2.28}$$


Entropy. With the entropy density $s : \Omega_t \times [0,T] \to \mathbb{R}$, the generic
entropy balance reads

$$\dot{s} = \operatorname{div} q_s + p_s + s_s. \tag{2.29}$$

The second law of thermodynamics

$$p_s \geq 0 \tag{2.30}$$

states that the entropy production may never be negative, thereby
restricting the direction of physical processes (Lebon et al., 2008). Note
that thermodynamical theories sometimes differ in their assumptions
on the flux $q_s$ and supply $s_s$ as either fixed or constitutive quantities,
see Lebon et al. (2008) or Cimmelli et al. (2014) for an overview. In
Sec. 2.1.3, we follow Coleman and Noll (1963) in the context of
rational thermodynamics. Note, however, that more general approaches
for exploiting the entropy balance exist (Liu, 1972).


2.1.3 Thermodynamic restrictions 


The balance equations (2.24)-(2.29) are assumed to hold universally. For
predicting the (thermo)mechanical behavior of specific materials, their
properties have to be encoded in the form of constitutive equations for
the energies and fluxes (Liu, 2002, Sec. 8.4). In the framework of rational
thermodynamics, the second law of thermodynamics is interpreted as a
restriction on these constitutive relations, i.e., the material laws should be
formulated so that (2.30) holds identically. Material models conforming
to this restriction are called thermodynamically consistent. For evaluating
the implications of the second law, Coleman and Noll (1963) proposed a
systematic approach based on the Clausius-Duhem inequality, which has
been widely adopted in modern continuum mechanics. In the following,
we give a brief summary for the case of solids with internal variables
at small strains. Coleman and Noll (1963) rely on the constitutive
assumptions $s_s = w/\theta$ for the entropy supply and $q_s = -q/\theta$ for the
entropy flux, where $\theta : \Omega_t \times [0,T] \to \mathbb{R}_{>0}$ is the absolute temperature. For
this specific formulation of the entropy balance, equations (2.28)-(2.30)
may be combined to yield the Clausius-Duhem inequality

$$\theta \dot{s} - \dot{e} + \sigma : \dot{\varepsilon} - \frac{1}{\theta}\, q \cdot \nabla\theta \geq 0 \tag{2.31}$$


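with the temperature gradient $\nabla\theta : \Omega_t \times [0,T] \to \mathbb{R}^d$. Let
$z : \Omega_t \times [0,T] \to Z$ denote an array of internal variables with an associated vector space
$Z$ which is assumed to be sufficiently large. For specifying the material
behavior, we assume that the Helmholtz free energy $\psi$, related to the
internal energy by

$$e = \psi + \theta s, \tag{2.32}$$

has the form

$$\psi : \mathrm{Sym}(d) \times \mathbb{R}_{>0} \times \mathbb{R}^d \times Z \to \mathbb{R}, \tag{2.33}$$
$$(\varepsilon, \theta, \nabla\theta, z) \mapsto \psi(\varepsilon, \theta, \nabla\theta, z). \tag{2.34}$$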
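Inserting the Helmholtz free energy (2.32) into the Clausius-Duhem
inequality (2.31) yields

$$\left(\sigma - \frac{\partial\psi}{\partial\varepsilon}(\varepsilon,\theta,\nabla\theta,z)\right) : \dot{\varepsilon} - \left(\frac{\partial\psi}{\partial\theta}(\varepsilon,\theta,\nabla\theta,z) + s\right) \dot{\theta} - \frac{\partial\psi}{\partial\nabla\theta}(\varepsilon,\theta,\nabla\theta,z) \cdot \nabla\dot{\theta} - \frac{\partial\psi}{\partial z}(\varepsilon,\theta,\nabla\theta,z) \cdot \dot{z} - \frac{1}{\theta}\, q \cdot \nabla\theta \geq 0, \tag{2.35}$$

assuming that $\psi$ is sufficiently smooth in all arguments. Note that (2.35)
has to hold for arbitrary physical processes. As, in principle, any path
may be realized for $\dot{\varepsilon}$, $\dot{\theta}$ and $\nabla\dot{\theta}$ by choosing suitable (experimental)
boundary conditions, the terms linear in these quantities must vanish.
In particular, the free energy is independent of the temperature gradient

$$\frac{\partial\psi}{\partial\nabla\theta}(\varepsilon,\theta,\nabla\theta,z) = 0 \tag{2.36}$$

and, therefore, $\nabla\theta$ is removed from the argument list in the following.
In addition, we obtain potential relations for the stress

$$\sigma = \frac{\partial\psi}{\partial\varepsilon}(\varepsilon,\theta,z) \tag{2.37}$$

and entropy

$$s = -\frac{\partial\psi}{\partial\theta}(\varepsilon,\theta,z). \tag{2.38}$$

For simplicity, the terms in the residual inequality are commonly treated
separately

$$-\frac{\partial\psi}{\partial z}(\varepsilon,\theta,z) \cdot \dot{z} \geq 0, \tag{2.39}$$
$$-q \cdot \nabla\theta \geq 0, \tag{2.40}$$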




where, in the spirit of linear irreversible thermodynamics, the heat flux
term may be covered by assuming Fourier’s law

$$q = -\kappa \nabla\theta \tag{2.41}$$

with a positive definite thermal conductivity tensor $\kappa \in \mathrm{Sym}^+(d)$. To
conclude, suitable evolution equations, respecting the inequality (2.39),
have to be supplied for the internal variables in addition to a free energy
$\psi$ to complete a thermodynamically consistent material model.


2.1.4 Generalized standard materials 


A widely adopted framework for thermodynamically consistent material
models is the two-potential formulation of generalized standard
materials (GSMs) (Halphen and Nguyen, 1975; Germain et al., 1983). A
GSM is described by a convex free energy $\psi$ (2.32), and a convex and
non-negative dissipation potential

$$\phi : \mathbb{R}_{>0} \times Z \to \mathbb{R}_{\geq 0}, \quad (\theta, \dot{z}) \mapsto \phi(\theta, \dot{z}), \tag{2.42}$$

with $\phi(\theta, 0) = 0$. The relation between the dissipation potential $\phi$ and
the thermodynamical driving forces $A = -\partial\psi/\partial z\,(\varepsilon, \theta, z)$, living in the
continuous dual space $Z^*$ of $Z$, is expressed via Biot’s equation

$$A \in \partial_{\dot{z}} \phi(\theta, \dot{z}). \tag{2.43}$$

Here, $\partial_{\dot{z}} \phi$ stands for the subdifferential of $\phi$ with respect to $\dot{z}$, defined by

$$\partial_{\dot{z}} \phi(\theta, \dot{z}) = \{A \in Z^* \mid \phi(\theta, y) - \phi(\theta, \dot{z}) \geq A \cdot (y - \dot{z}), \ \forall y \in Z\}, \tag{2.44}$$

see (Rockafellar, 1970, Sec. 23). Thus, using our initial assumptions on $\phi$
and choosing $y = 0$, the above definition (2.44) yields

$$A \cdot \dot{z} \geq \phi(\theta, \dot{z}) \geq 0, \tag{2.45}$$


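demonstrating that the residual inequality (2.39) holds. Equivalently,
GSMs may be formulated in terms of the force potential

$$\phi^*(A, \theta) = \sup_{\dot{z} \in Z} \{A \cdot \dot{z} - \phi(\theta, \dot{z})\}, \tag{2.46}$$

i.e., the Legendre-Fenchel dual of $\phi$ with respect to $\dot{z}$, so that the
evolution equations are given explicitly by

$$\dot{z} \in \partial_A \phi^*(A, \theta). \tag{2.47}$$

In addition to being thermodynamically consistent, GSMs enjoy the
property that, after a backward Euler time discretization and condensation
of internal variables, they permit expressing the stress $\sigma$ in terms of
a condensed incremental potential $w : \mathrm{Sym}(d) \times \mathbb{R}_{>0} \to \mathbb{R}$

$$\sigma = \frac{\partial w}{\partial \varepsilon}(\varepsilon, \theta), \tag{2.48}$$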


which does not depend on $z$ (Lahellec and Suquet, 2007). Thus, for a
fixed time step, a GSM effectively behaves like a nonlinear hyperelastic
material. Last but not least, a few synoptic remarks are in order:

• In the later Chapters 3-6, we discuss FFT-based micromechanics
solvers in the framework of convex optimization, assuming that the
stress operator derives from a convex potential. Thus, the property
(2.48) is particularly convenient as all GSMs are covered by the theory.

• Note that the condensed potential $w$ carries no physical significance,
as it arises as a mixture of free energy, dissipation potential and
time discretization. Hence, $w$ is usually not explicitly formulated
or evaluated.

• The GSM framework covers a wide range of material models, such
as classical J2-plasticity (Simo and Hughes, 1998) or certain types of
crystal plasticity models, see Sec. 4.3.2 or Fritzen and Leuschner (2013).
However, it is far from universal and many widely adopted models do
not adhere to the two-potential formalism. When discussing specific
material models in the later chapters, we indicate cases which are not 
covered by the theory. 
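
The condensation property (2.48) can be made concrete for a simple case. The following minimal sketch assumes a one-dimensional Maxwell-type model that does not appear in the text: a quadratic free energy $\psi(\varepsilon, z) = \tfrac{1}{2} E (\varepsilon - z)^2$ and a quadratic dissipation potential $\phi(\dot{z}) = \tfrac{1}{2} \eta \dot{z}^2$, with hypothetical parameter values. After a backward Euler step, the stress is compared with a finite-difference derivative of the condensed potential, written here in the form $\min_z [\psi(\varepsilon, z) + \Delta t\, \phi((z - z_n)/\Delta t)]$.

```python
E_MOD = 10.0   # spring stiffness (hypothetical value)
ETA = 5.0      # dashpot viscosity (hypothetical value)

def update_internal(eps, z_old, dt):
    """Backward Euler step of Biot's equation eta * z_dot = E * (eps - z)."""
    return (ETA * z_old + dt * E_MOD * eps) / (ETA + dt * E_MOD)

def condensed_potential(eps, z_old, dt):
    """w(eps) = psi(eps, z) + dt * phi((z - z_old) / dt) at the minimizing z."""
    z = update_internal(eps, z_old, dt)
    psi = 0.5 * E_MOD * (eps - z) ** 2
    phi = 0.5 * ETA * ((z - z_old) / dt) ** 2
    return psi + dt * phi

def stress(eps, z_old, dt):
    """Stress from the potential relation sigma = d(psi)/d(eps) at the updated z."""
    z = update_internal(eps, z_old, dt)
    return E_MOD * (eps - z)

eps, z_old, dt, h = 0.02, 0.005, 0.1, 1e-6
sigma = stress(eps, z_old, dt)
# Central finite difference of the condensed potential with respect to eps
dw = (condensed_potential(eps + h, z_old, dt)
      - condensed_potential(eps - h, z_old, dt)) / (2.0 * h)
print(sigma, dw)
```

Both numbers agree up to rounding, so for a fixed time step the model responds like a (here linear) hyperelastic material whose potential depends only on the strain and the frozen state $z_n$.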


2.2 FFT-based micromechanics 


This section introduces the basic problem setting for computational 
micromechanics at small strains, providing the background for the 
algorithms proposed in Ch. 3 - Ch. 6. In particular, we discuss two 
reformulations of the periodic cell problem, the Lippmann-Schwinger 
equation (Zeller and Dederichs, 1973) and the Eyre-Milton equation 
(Eyre and Milton, 1999), each giving rise to a distinct family of FFT-based 
solution schemes. In both cases, we interpret the respective methods 
in the framework of convex optimization, which serves as the natural 
setting for FFT-based solvers throughout this manuscript. For brevity of 
exposition, we restrict to the continuous setting. Please note, however, 
that the choice of discretization (Moulinec and Suquet, 1998; Willot, 2015; 
Schneider et al., 2016) constitutes an important topic in and of itself, with 
substantial repercussions on the convergence behavior for problems with 
infinite material contrast (Schneider, 2020b). For a thorough overview 
on state-of-the-art FFT-based micromechanics, we refer to the review by 
Schneider (2021). 


2.2.1 Cell problem 


Based on the framework of small-strain continuum mechanics, we 
specify the periodic cell problem for computing the effective response 
of heterogeneous materials. For clarity of exposition, we restrict to the
purely mechanical setting, i.e., we implicitly assume that the temperature
field is homogeneous and constant in time and suppress the temperature
dependence of all quantities. The extended framework for thermomechanically
coupled materials by Chatzigeorgiou et al. (2016) is summarized
in Sec. 5.2. Let $Y = [0, L]^d$ be a periodic cell on the microscale
and denote the position vector by $x \in Y$. The material distribution in
the cell, i.e., the microstructure, is encoded in the heterogeneous stress
operator $\sigma : Y \times \mathrm{Sym}(d) \to \mathrm{Sym}(d)$, $(x, \varepsilon) \mapsto \sigma(x, \varepsilon)$. We consider the
vector space of periodic and mean-free displacement fields

$$H^1_\#(Y; \mathbb{R}^d) = \{u \in H^1(Y; \mathbb{R}^d) \mid u \text{ periodic}, \ \partial_n u \text{ anti-periodic on } \partial Y, \ \langle u \rangle_Y = 0\}, \tag{2.49}$$

where $\langle \cdot \rangle_Y = \frac{1}{|Y|} \int_Y (\cdot) \, dv$ denotes volume averaging over $Y$. For a
prescribed macroscopic strain $\bar{\varepsilon}$, we seek a solution $u \in H^1_\#(Y; \mathbb{R}^d)$ to
the quasi-static balance of linear momentum on the microscale

$$\operatorname{div} \sigma(\cdot, \bar{\varepsilon} + \nabla^s u) = 0, \tag{2.50}$$

where the volume forces vanish as a result of asymptotic homogenization
(Bakhvalov and Panasenko, 1989). Given a solution $u$ to (2.50), the
macroscopic stress computed by $\bar{\sigma} = \langle \sigma(\cdot, \bar{\varepsilon} + \nabla^s u) \rangle_Y$ constitutes the
effective mechanical response of the material to the loading $\bar{\varepsilon}$. For the
convenience of the reader, we restrict our exposition to pure strain 
boundary conditions, see Ch. 4 for the case of mixed boundary condi- 
tions following Kabel et al. (2016). 


2.2.2 Lippmann-Schwinger equation 


In the context of FFT-based micromechanics, many successful algo- 
rithms are based on an equivalent reformulation of (2.50), the so-called 
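Lippmann-Schwinger equation (Zeller and Dederichs, 1973). As a starting
point, consider the elastic problem

$$\operatorname{div} \mathbb{C}^0 : \nabla^s u = -b \tag{2.51}$$

with homogeneous stiffness tensor $\mathbb{C}^0 \in L(\mathrm{Sym}(d))$ and a mean-free
right hand side $b \in H^{-1}_\#(Y; \mathbb{R}^d)$ in the space of (volume) forces. For
solving (2.51), we express $u$ and $b$ as Fourier series

$$u(x) = \sum_{\xi \in \mathbb{Z}^d} \hat{u}(\xi) \exp(i\, x \cdot \bar{\xi}), \quad b(x) = \sum_{\xi \in \mathbb{Z}^d} \hat{b}(\xi) \exp(i\, x \cdot \bar{\xi}), \tag{2.52}$$

with $\bar{\xi} = 2\pi\xi/L$. Recalling that the Fourier coefficients of the divergence
of a tensor field $A$ and the symmetrized gradient of a vector field $v$ are
given by

$$\widehat{\operatorname{div} A}(\xi) = i\, \hat{A}(\xi) \cdot \bar{\xi} \quad \text{and} \quad \widehat{\nabla^s v}(\xi) = i\, \bar{\xi} \otimes^s \hat{v}(\xi), \tag{2.53}$$

respectively, the homogeneous problem (2.51) reads

$$\hat{b}(\xi) = [\mathbb{C}^0 : (\bar{\xi} \otimes^s \hat{u}(\xi))] \cdot \bar{\xi} \tag{2.54}$$

in Fourier space. For an isotropic stiffness tensor $\mathbb{C}^0$ with Lamé constants
$\lambda_0$ and $\mu_0$, equation (2.54) may be rearranged to

$$\hat{u}(\xi) = \left( \frac{1}{\mu_0 \|\bar{\xi}\|^2}\, I - \frac{\mu_0 + \lambda_0}{\mu_0 (2\mu_0 + \lambda_0)} \frac{\bar{\xi} \otimes \bar{\xi}}{\|\bar{\xi}\|^4} \right) \cdot \hat{b}(\xi), \quad \xi \neq 0. \tag{2.55}$$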
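
The rearrangement leading to (2.55) can be verified numerically for a single frequency. The sketch below is only an illustration with hypothetical function names and values. For an isotropic stiffness, relation (2.54) amounts to multiplying the displacement coefficient by the matrix $\mu_0 \|\xi\|^2 I + (\lambda_0 + \mu_0)\, \xi \otimes \xi$. The code builds this matrix and the closed-form matrix from (2.55), and checks that their product is the identity.

```python
def acoustic_tensor(xi, lam0, mu0):
    """Matrix K with b_hat = K u_hat for isotropic stiffness, cf. (2.54)."""
    n2 = sum(x * x for x in xi)
    d = len(xi)
    return [[mu0 * n2 * float(i == j) + (lam0 + mu0) * xi[i] * xi[j]
             for j in range(d)] for i in range(d)]

def inverse_acoustic_tensor(xi, lam0, mu0):
    """Closed-form inverse of K as in (2.55), valid for xi != 0."""
    n2 = sum(x * x for x in xi)
    d = len(xi)
    c = (mu0 + lam0) / (mu0 * (2.0 * mu0 + lam0))
    return [[float(i == j) / (mu0 * n2) - c * xi[i] * xi[j] / n2**2
             for j in range(d)] for i in range(d)]

xi = [1.0, -2.0, 0.5]       # hypothetical (scaled) frequency vector
lam0, mu0 = 1.2, 0.8        # hypothetical Lamé constants
K = acoustic_tensor(xi, lam0, mu0)
K_inv = inverse_acoustic_tensor(xi, lam0, mu0)
product = [[sum(K[i][k] * K_inv[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
max_dev = max(abs(product[i][j] - float(i == j))
              for i in range(3) for j in range(3))
print(max_dev)
```

The deviation from the identity is at the level of rounding errors, which confirms the closed form for this sample frequency.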


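Hence, the solution operator $G^0$ associated to (2.51), i.e.,

$$\operatorname{div} \mathbb{C}^0 : \nabla^s u = -b \quad \text{iff} \quad u = -G^0 \cdot b, \tag{2.56}$$

admits the Fourier space representation (Mura, 1987)

$$\hat{G}^0(\xi) = \begin{cases} -\left( \dfrac{1}{\mu_0 \|\bar{\xi}\|^2}\, I - \dfrac{\mu_0 + \lambda_0}{\mu_0 (2\mu_0 + \lambda_0)} \dfrac{\bar{\xi} \otimes \bar{\xi}}{\|\bar{\xi}\|^4} \right), & \xi \neq 0, \\ 0, & \xi = 0. \end{cases} \tag{2.57}$$

For simplicity, we restrict to reference materials which are a multiple
of the identity, i.e., $\mathbb{C}^0 = \alpha_0 I$ with $\mu_0 = \alpha_0/2$ and $\lambda_0 = 0$. Subtracting
$\operatorname{div} \mathbb{C}^0 : \nabla^s u$ on both sides of equation (2.50), the original problem can
be recast in the form of (2.51)

$$\operatorname{div} \mathbb{C}^0 : \nabla^s u = -\operatorname{div} [\sigma(\cdot, \bar{\varepsilon} + \nabla^s u) - \mathbb{C}^0 : (\bar{\varepsilon} + \nabla^s u)] \tag{2.58}$$

with $b = \operatorname{div} [\sigma(\cdot, \bar{\varepsilon} + \nabla^s u) - \mathbb{C}^0 : (\bar{\varepsilon} + \nabla^s u)]$. Thus, using the property
(2.56), the solution $u$ of (2.50) may be expressed by

$$u = -G^0 \operatorname{div} [\sigma(\cdot, \bar{\varepsilon} + \nabla^s u) - \mathbb{C}^0 : (\bar{\varepsilon} + \nabla^s u)]. \tag{2.59}$$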


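Taking the symmetrized gradient of (2.59) and adding the macroscopic
strain $\bar{\varepsilon}$ yields the Lippmann-Schwinger equation

$$\varepsilon = \bar{\varepsilon} - \Gamma^0 : (\sigma(\cdot, \varepsilon) - \mathbb{C}^0 : \varepsilon), \tag{2.60}$$

where the total strain $\varepsilon = \bar{\varepsilon} + \nabla^s u$ and the operator $\Gamma^0 = \nabla^s G^0 \operatorname{div}$ are
introduced. The original FFT-based solver, the basic scheme by Moulinec
and Suquet (1994; 1998), is the fixed-point iteration associated to (2.60)

$$\varepsilon_{k+1} = \bar{\varepsilon} - \Gamma^0 : (\sigma(\cdot, \varepsilon_k) - \mathbb{C}^0 : \varepsilon_k), \tag{2.61}$$

where $\Gamma^0$ is applied in Fourier space.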
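
The structure of the basic scheme (2.61) can be sketched for a one-dimensional linear elastic laminate. This is a deliberately simplified illustration, not the three-dimensional algorithm. In one dimension with $\mathbb{C}^0 = \alpha_0$, the Fourier multiplier of $\Gamma^0$ equals $1/\alpha_0$ for all nonzero frequencies and zero otherwise, so applying $\Gamma^0$ reduces to subtracting the mean and dividing by $\alpha_0$, and no FFT is needed. The voxel stiffnesses, the choice of $\alpha_0$ and the function name are hypothetical.

```python
def basic_scheme_1d(stiffness, eps_bar, alpha0, tol=1e-10, max_iter=10_000):
    """Basic scheme (2.61) for a 1D linear elastic laminate, one value per voxel."""
    n = len(stiffness)
    eps = [eps_bar] * n                      # initial guess: homogeneous strain
    for k in range(1, max_iter + 1):
        # polarization tau = sigma(eps) - C0 : eps
        tau = [(c - alpha0) * e for c, e in zip(stiffness, eps)]
        tau_mean = sum(tau) / n
        # in 1D: Gamma0 : tau = (tau - <tau>) / alpha0
        eps_new = [eps_bar - (t - tau_mean) / alpha0 for t in tau]
        change = max(abs(a - b) for a, b in zip(eps_new, eps))
        eps = eps_new
        if change < tol:
            break
    sigma_bar = sum(c * e for c, e in zip(stiffness, eps)) / n
    return eps, sigma_bar, k

stiffness = [1.0] * 6 + [4.0] * 3 + [10.0]   # hypothetical three-phase laminate
alpha0 = 0.5 * (min(stiffness) + max(stiffness))
eps, sigma_bar, iterations = basic_scheme_1d(stiffness, 0.01, alpha0)
harmonic_mean = len(stiffness) / sum(1.0 / c for c in stiffness)
print(sigma_bar, 0.01 * harmonic_mean, iterations)
```

For a laminate loaded in series, the exact effective stiffness is the harmonic mean of the phase stiffnesses, which the converged iteration reproduces. In higher dimensions, the mean subtraction is replaced by the application of $\Gamma^0$ via forward and backward FFTs.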
2.2.3 Variational framework


Under the assumption that the stress operator $\sigma$ is derived from a
(condensed) potential

$$\sigma = \frac{\partial w}{\partial \varepsilon}, \tag{2.62}$$

the cell problem may be embedded in a variational framework. In the
following, we briefly establish the strain-based minimization problem
(Bellis and Suquet, 2019) and its relation to the basic scheme (2.61).
Please note that an equivalent description in terms of displacement
fluctuations is possible (Schneider, 2017a) and enables more memory-efficient
implementations (Kabel et al., 2014). Consider the space of
compatible strain fluctuations

$$U = \{\tilde{\varepsilon} \in L^2(Y; \mathrm{Sym}(d)) \mid \tilde{\varepsilon} = \nabla^s u, \ u \in H^1_\#(Y; \mathbb{R}^d), \ \langle \tilde{\varepsilon} \rangle_Y = 0\} \tag{2.63}$$


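as a subset of all periodic and square integrable stress and strain fields
$L^2(Y; \mathrm{Sym}(d))$ with the associated inner product

$$\langle S, T \rangle_{L^2} = \langle S : T \rangle_Y, \quad S, T \in L^2(Y; \mathrm{Sym}(d)). \tag{2.64}$$

We seek a minimizer of the mean strain energy

$$W(\tilde{\varepsilon}) = \langle w(\cdot, \bar{\varepsilon} + \tilde{\varepsilon}) \rangle_Y \to \min_{\tilde{\varepsilon} \in U}. \tag{2.65}$$

The differential of $W$ reads

$$DW(\tilde{\varepsilon})[S] = \langle \Gamma : \sigma(\cdot, \bar{\varepsilon} + \tilde{\varepsilon}) : S \rangle_Y, \quad S \in U, \tag{2.66}$$

where $\Gamma = \nabla^s (\operatorname{div} \nabla^s)^{-1} \operatorname{div}$ is the projector upon $U$ by the Helmholtz
decomposition, see App. A. For the chosen (sub)space $U$ with inner
product (2.64), the gradient is defined by

$$DW(\tilde{\varepsilon})[S] = \langle \nabla W(\tilde{\varepsilon}), S \rangle_{L^2}, \quad \forall S \in U, \tag{2.67}$$

hence, we obtain

$$\nabla W(\tilde{\varepsilon}) = \Gamma : \sigma(\cdot, \bar{\varepsilon} + \tilde{\varepsilon}). \tag{2.68}$$

The condition for critical points of $W$

$$\Gamma : \sigma(\cdot, \varepsilon) = 0 \tag{2.69}$$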


is equivalent to the quasi-static balance of linear momentum on the 
microscale (2.50), thereby recovering the cell problem. Interpreting 
FFT-based micromechanics as an optimization problem has several 
immediate advantages. Firstly, Kabel et al. (2014) noted that the gradient
descent iteration with step size $\gamma_k$

$$\varepsilon_{k+1} = \varepsilon_k - \gamma_k\, \Gamma : \sigma(\cdot, \varepsilon_k) \tag{2.70}$$

associated to (2.65) is precisely the basic scheme by Moulinec and Suquet
(1994; 1998) with $\mathbb{C}^0 = 1/\gamma_k\, I$. This elucidates the role of the reference
material $\mathbb{C}^0$ as an algorithmic rather than a physical parameter and
clarifies its optimal choice. Indeed, for a $\mu$-strongly convex energy $w$
with an $L$-Lipschitz gradient, i.e.,

$$\langle \sigma(\cdot, \varepsilon_1) - \sigma(\cdot, \varepsilon_2), \varepsilon_1 - \varepsilon_2 \rangle_{L^2} \geq \mu\, \|\varepsilon_1 - \varepsilon_2\|^2_{L^2},$$
$$\|\sigma(\cdot, \varepsilon_1) - \sigma(\cdot, \varepsilon_2)\|_{L^2} \leq L\, \|\varepsilon_1 - \varepsilon_2\|_{L^2}, \tag{2.71}$$

for all $\varepsilon_1, \varepsilon_2 \in L^2(Y; \mathrm{Sym}(d))$, the optimal reference material (Nesterov,
2004, Sec. 1.2.3) reads

$$\mathbb{C}^0 = \frac{\mu + L}{2}\, I, \tag{2.72}$$

generalizing the choice for the linear elastic setting (Moulinec and
Suquet, 1998). Secondly, in the variational framework, well-established
algorithms for convex optimization (Boyd and Vandenberghe, 2004;
Nocedal and Wright, 1999), improving upon the performance of simple
gradient descent, become available for FFT-based micromechanics, see
Ch. 3.


2.2.4 Eyre-Milton equation and polarization-based schemes 


Eyre and Milton (1999) proposed an equivalent reformulation of the
Lippmann-Schwinger equation in terms of a positive polarization
$P = \sigma(\cdot, \varepsilon) + \mathbb{C}^0 : \varepsilon$ as primary variable, giving rise to a separate class of
FFT-based methods. The Eyre-Milton equation reads

$$P - Y^0 : Z^0(P) = 2\, \mathbb{C}^0 : \bar{\varepsilon} \tag{2.73}$$

with the nonlocal operator

$$Y^0 = I - 2\, \mathbb{C}^0 : \Gamma^0, \tag{2.74}$$

which is readily applied in Fourier space, and the local operator

$$Z^0 = I - 2\, \mathbb{C}^0 : (\sigma + \mathbb{C}^0)^{-1}, \tag{2.75}$$


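leading to a similar structure compared to the Lippmann-Schwinger
equation (2.60) with nonlocal operator $\Gamma^0$ and local operator $\sigma - \mathbb{C}^0$. We
emphasize that $\sigma$ in (2.75) denotes the stress operator and is not to be
confused with the stress field, i.e., applying

$$\varepsilon = (\sigma + \mathbb{C}^0)^{-1}(P) \tag{2.76}$$

is equivalent to solving

$$P = \sigma(\cdot, \varepsilon) + \mathbb{C}^0 : \varepsilon \tag{2.77}$$

for $\varepsilon \in L^2(Y; \mathrm{Sym}(d))$. By noting that $Z^0$ translates the positive polarization
$P = \sigma(\cdot, \varepsilon) + \mathbb{C}^0 : \varepsilon$ to the negative polarization $\tau = \sigma(\cdot, \varepsilon) - \mathbb{C}^0 : \varepsilon$,
the equivalence of the Eyre-Milton equation (2.73) and the Lippmann-Schwinger
equation (2.60) is readily established

$$\begin{aligned}
& P - Y^0 : Z^0(P) = 2\, \mathbb{C}^0 : \bar{\varepsilon} \\
\Leftrightarrow\ & P - (I - 2\, \mathbb{C}^0 : \Gamma^0) : \tau = 2\, \mathbb{C}^0 : \bar{\varepsilon} \\
\Leftrightarrow\ & P - \tau + 2\, \mathbb{C}^0 : \Gamma^0 : \tau = 2\, \mathbb{C}^0 : \bar{\varepsilon} \\
\Leftrightarrow\ & \varepsilon + \Gamma^0 : \tau = \bar{\varepsilon}.
\end{aligned} \tag{2.78}$$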


For linear problems (Eyre and Milton, 1999; Michel et al., 2001), the
fixed-point iteration associated to (2.73)

$$P_{k+1} = 2\, \mathbb{C}^0 : \bar{\varepsilon} + Y^0 : Z^0(P_k) \tag{2.79}$$


and damped versions thereof (Monchiet and Bonnet, 2012; Moulinec 
and Silva, 2014) were found to converge much faster than the basic 
scheme (2.61) for a suitable choice of C°. Similar to the basic scheme, the 
extension to inelastic problems was facilitated by connecting the Eyre- 
Milton scheme (2.79) to classical operator splitting methods (Peaceman 
and Rachford, 1955; Douglas and Rachford, 1956), see Schneider et al. 
(2019). 

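For some Hilbert space $V$ and a function $f : V \to \mathbb{R}$, $x \mapsto f(x)$ which
admits the representation $f(x) = g(x) + h(x)$, the Peaceman-Rachford
iteration (Peaceman and Rachford, 1955) associated to the minimization
problem

$$f(x) \to \min \tag{2.80}$$

reads

$$z_{k+1} = \left(2 (I + \gamma\, \partial g)^{-1} - I\right) \left(2 (I + \gamma \nabla h)^{-1} - I\right) z_k \tag{2.81}$$

with the iterate $z_k = (I + \gamma \nabla h)\, x_k$ and a step size $\gamma$. Note that for an
indicator function of a convex set $C$

$$\iota_C(x) = \begin{cases} 0, & x \in C, \\ \infty, & x \notin C, \end{cases} \tag{2.82}$$

the operator $(I + \gamma\, \partial \iota_C)^{-1}$ is equivalent to the projector $P_C$ upon $C$, see,
for instance, Combettes and Pesquet (2011). To establish the connection
to the Eyre-Milton scheme, consider the reformulation of problem (2.65)

$$\langle w(\cdot, \varepsilon) \rangle_Y + \iota_{U_{\bar{\varepsilon}}}(\varepsilon) \to \min_{\varepsilon \in L^2(Y; \mathrm{Sym}(d))} \tag{2.83}$$

with the indicator function $\iota_{U_{\bar{\varepsilon}}}$ for the space of compatible strain fields
adhering to the prescribed boundary conditions

$$U_{\bar{\varepsilon}} = \{\varepsilon \in L^2(Y; \mathrm{Sym}(d)) \mid \varepsilon = \bar{\varepsilon} + \nabla^s u, \ u \in H^1_\#(Y; \mathbb{R}^d)\}. \tag{2.84}$$

In analogy to the space of strain fluctuations $U$ (2.63), the projector
$P_{U_{\bar{\varepsilon}}}(\varepsilon) = \bar{\varepsilon} + \Gamma : \varepsilon$ upon $U_{\bar{\varepsilon}}$ is derived from the Helmholtz decomposition.
Thus, the Peaceman-Rachford iteration (2.81) associated to the
problem (2.83) with $h = \langle w \rangle_Y$ and $g = \iota_{U_{\bar{\varepsilon}}}$ reads

$$z_{k+1} = 2 \bar{\varepsilon} + (2 \Gamma - I) \left(2 (I + \gamma \sigma)^{-1} - I\right) z_k. \tag{2.85}$$

Upon multiplication of (2.85) with the reference material $\mathbb{C}^0 = 1/\gamma\, I$, the
Eyre-Milton iteration (2.79) is recovered

$$P_{k+1} = 2\, \mathbb{C}^0 : \bar{\varepsilon} + \underbrace{(2\, \mathbb{C}^0 : \Gamma^0 - I)}_{= -Y^0} \underbrace{\left(2\, \mathbb{C}^0 : (\mathbb{C}^0 + \sigma)^{-1} - I\right)}_{= -Z^0} P_k. \tag{2.86}$$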
Hence, the convergence analysis for splitting methods by Giselsson and
Boyd (2017) carries over to polarization-based schemes and the optimal
choice for the reference material is

$$\mathbb{C}^0 = \sqrt{\mu L}\, I \tag{2.87}$$

for strain energies satisfying (2.71). However, note that, in contrast to the
reference material of the basic scheme (2.72), this choice (2.87) becomes
ill-defined for cases where $\mu$ tends to zero, such as perfect elastoplasticity
or porous materials. A strategy for circumventing this disadvantage is 
presented in Ch. 6. 
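
The Eyre-Milton iteration (2.79) can be sketched for the same kind of one-dimensional linear elastic laminate that serves as a toy problem for the basic scheme. The sketch rests on simplifying assumptions that are not part of the text. In one dimension with $\mathbb{C}^0 = \alpha_0$, the operator $\Gamma^0$ subtracts the mean and divides by $\alpha_0$, so that $Y^0 : \tau = 2 \langle \tau \rangle_Y - \tau$, and the local inversion (2.76) is a division because the phases are linear elastic. The stiffness values and the function name are hypothetical, and $\alpha_0$ is chosen as the geometric mean of the extreme stiffnesses, in the spirit of (2.87).

```python
def eyre_milton_1d(stiffness, eps_bar, alpha0, tol=1e-10, max_iter=10_000):
    """Polarization iteration (2.79) for a 1D linear elastic laminate."""
    n = len(stiffness)
    # initial polarization P = sigma + C0 : eps for the homogeneous strain eps_bar
    P = [(c + alpha0) * eps_bar for c in stiffness]
    for k in range(1, max_iter + 1):
        # local step: eps = (sigma + C0)^{-1}(P), cf. (2.76)
        eps = [p / (c + alpha0) for p, c in zip(P, stiffness)]
        # Z0(P) = P - 2 C0 : eps = sigma - C0 : eps (negative polarization tau)
        tau = [p - 2.0 * alpha0 * e for p, e in zip(P, eps)]
        tau_mean = sum(tau) / n
        # nonlocal step in 1D: Y0 : tau = 2 <tau> - tau
        P_new = [2.0 * alpha0 * eps_bar + 2.0 * tau_mean - t for t in tau]
        change = max(abs(a - b) for a, b in zip(P_new, P))
        P = P_new
        if change < tol:
            break
    eps = [p / (c + alpha0) for p, c in zip(P, stiffness)]
    sigma_bar = sum(c * e for c, e in zip(stiffness, eps)) / n
    return eps, sigma_bar, k

stiffness = [1.0] * 6 + [4.0] * 3 + [10.0]        # hypothetical three-phase laminate
alpha0 = (min(stiffness) * max(stiffness)) ** 0.5  # geometric mean of the bounds
eps, sigma_bar, iterations = eyre_milton_1d(stiffness, 0.01, alpha0)
harmonic_mean = len(stiffness) / sum(1.0 / c for c in stiffness)
print(sigma_bar, 0.01 * harmonic_mean, iterations)
```

As for any laminate loaded in series, the converged mean stress matches the harmonic-mean stiffness times the prescribed strain. If one of the stiffnesses were set to zero, the geometric mean would vanish and the local division would break down for that phase, which mirrors the ill-posedness noted above for $\mu \to 0$.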
Chapter 3


On Quasi-Newton methods in
FFT-based micromechanics¹


3.1 Introduction 


In the context of FFT-based computational homogenization, Newton’s 
method was combined with the conjugate-gradient (CG) solver in the 
small- (Gélébart and Mondon-Cancel, 2013) and finite-strain setting 
(Kabel et al., 2014) and exhibited excellent performance. Due to the small 
number of required function evaluations, these schemes proved to be 
particularly powerful for problems with computationally expensive 
material laws, such as single-crystal plasticity (Shanthraj et al., 2015;
Lucarini and Segurado, 2019; Ma and Truster, 2019), whose evaluation 
dominates the overall runtime. However, in contrast to gradient-based 
methods, the Newton-CG solver requires the evaluation of the material’s 
tangent stiffness for each voxel. This procedure can be computationally 
expensive for some material laws. Furthermore, the analytic deriva- 
tion of the tangent can be tedious and its implementation may require 
considerable programming effort, and is thus prone to errors. This 
gave rise to applying Quasi-Newton methods in FFT-based micromechanics.

¹ This chapter is based on Wicht et al. (2020b). For the sake of a coherent structure,
formatting and typography of this thesis, minor changes have been made. To avoid
redundancies in the text, the introduction has been shortened.

Quasi-Newton schemes rely upon an approximation of the
Hessian by generalizing the one-dimensional secant method and are
thereby tangent-free (Nocedal and Wright, 1999). Schneider (2019a)
used the Barzilai-Borwein method (Barzilai and Borwein, 1988), which
approximates the Hessian by a multiple of the identity, to accelerate
Moulinec-Suquet’s basic scheme. Shanthraj et al. (2015) pioneered using
Anderson acceleration (Anderson, 1965) in an FFT-based context. The
algorithm is included in the software DAMASK (Roters et al., 2018) as the
non-linear GMRES method. More recently, Chen et al. (2019a;b) successfully
adapted the Anderson acceleration to simulate damage initiation
and brittle fracture. Originally developed to accelerate general fixed- 
point iterations, Anderson acceleration was linked to Quasi-Newton 
schemes by Fang and Saad (2009). More precisely, it was identified as a 
generalized multisecant form of the second Broyden method (or "bad 
Broyden method") (Broyden, 1965) which approximates the Hessian 
in terms of a number m (called depth) of past iterates and gradients. 
Recently, Evans et al. (2020) proved that Anderson acceleration improves
the first-order convergence rate for fixed-point iterations. Pollock and
Rebholz (2021) extended the analysis to the non-contractive setting and 
provided sharper residual bounds. 

Motivated by the mentioned work on Quasi-Newton methods, we focus 
our attention on the powerful and popular Broyden-Fletcher-Goldfarb- 
Shanno (BFGS) algorithm (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; 
Shanno, 1970). We revisit its basics in the framework of (inexact) Newton 
methods in Sec. 3.2. Both Newton and Quasi-Newton methods require 
appropriate globalization strategies to ensure global convergence. Often, 
this is realized by a backtracking line search using appropriate conditions 
for the acceptance of the step size. However, applying the classical Wolfe 
conditions (Wolfe, 1969) to FFT-based micromechanics is not feasible, 
as function evaluations are not available in this setting in general, since 
the condensed potential (Lahellec and Suquet, 2007) of the material law 
carries no physical meaning and is therefore not computed. Hence,
we propose using the line-search conditions of Dong (2010),
which solely rely upon gradient evaluations, see Sec. 3.2.3. Another
aspect of major importance for the overall performance of
inexact (Quasi-)Newton methods is the choice of the forcing term, i.e., the
accuracy to which the linear system is solved. To this end, we revisit the
forcing-term strategies of Eisenstat and Walker (1996), see Sec. 3.2.4. In 
Sec. 3.3, we turn our attention to Newton and Quasi-Newton methods as 
applied in the context of FFT-based micromechanics. After revisiting the 
Newton-CG method and the Anderson acceleration, two possible uses of 
the BFGS update formula in the FFT-based setting are proposed. First, we 
investigate the limited-memory version of the BFGS algorithm (L-BFGS) 
by Nocedal (1980), which only stores the $m$ last differences of iterates and
gradients for its Hessian approximation, similar to the Anderson acceleration. A second
algorithm is derived, using the BFGS-update formula to approximate 
the local material tangent for every voxel instead of the Hessian of the 
global system. In analogy to the Newton-CG method, the resulting 
linear system is solved using conjugate gradients. Hence, we refer to the 
method as BFGS-CG. Last but not least, we compare the performance 
of the investigated solution algorithms and the impact of the different 
forcing-term choices for non-linear problems with finite and infinite 
material contrast, see Sec. 3.4. 


3.2 Newton and Quasi-Newton methods 


3.2.1 Newton’s method 
Let $V$ be a Hilbert space with an associated inner product $V \times V \to \mathbb{R}$,
$(x, y) \mapsto \langle x, y \rangle_V$ and the induced norm $\|x\|_V = \sqrt{\langle x, x \rangle_V}$. Suppose
a twice continuously differentiable function $f : V \to \mathbb{R}$ is given. Its
gradient $\nabla f : V \to V$ is defined by

$$Df(x)[v] = \langle \nabla f(x), v \rangle_V, \quad v \in V, \tag{3.1}$$

where $Df : V \to V'$ denotes the differential of $f$ and $V'$ is the continuous
dual of $V$. For a minimization problem

$$f(x) \to \min, \tag{3.2}$$

critical points of $f$ are characterized by

$$\nabla f(x) = 0. \tag{3.3}$$


Newton’s method iteratively updates an initial guess $x_0 \in V$ by the
formula

$$x_{n+1} = x_n + \xi_n, \quad \text{where } \xi_n \in V \text{ solves } D\nabla f(x_n)[\xi_n] = -\nabla f(x_n), \tag{3.4}$$

where $D\nabla f : V \to L(V, V)$ denotes the Hessian of $f$ and $L(V, V)$
denotes the space of linear mappings $V \to V$. Let $x^* \in V$ be a solution
to (3.3). Suppose that $D\nabla f(x^*)$ is an isomorphism and $D\nabla f$ is Lipschitz
continuous in a neighborhood of $x^*$. Then, if $x_0$ is sufficiently close to
$x^*$, the Newton iteration (3.4) converges, and if $D\nabla f$ is locally Lipschitz,
it does so with quadratic rate (Kantorovich, 1948).

To obtain global convergence, the Newton iteration (3.4) has to be
modified, for instance by damping, i.e., with $\alpha_n \in (0, 1]$,

$$x_{n+1} = x_n + \alpha_n \xi_n, \quad \text{where } \xi_n \in V \text{ solves } D\nabla f(x_n)[\xi_n] = -\nabla f(x_n). \tag{3.5}$$

The damping factor $\alpha_n$ is chosen by a line search procedure, for instance
by an approximate line search involving the Wolfe (1969) conditions

$$f(x_n + \alpha_n \xi_n) \leq f(x_n) + c_1 \alpha_n \langle \nabla f(x_n), \xi_n \rangle_V \tag{3.6}$$

and

$$\langle \nabla f(x_n + \alpha_n \xi_n), \xi_n \rangle_V \geq c_2 \langle \nabla f(x_n), \xi_n \rangle_V \tag{3.7}$$

for fixed constants $0 < c_1 < c_2 < 1$.

For large scale applications, the equation $D\nabla f(x_n)[\xi_n] = -\nabla f(x_n)$
for the Newton increment can often only be solved iteratively up to
a prescribed precision, leading to an inexact, damped Newton method
\[ x_{n+1} = x_n + \alpha_n \xi_n, \quad \text{where } \xi_n \in V \text{ solves } \|D\nabla f(x_n)[\xi_n] + \nabla f(x_n)\|_V \le \eta_n \|\nabla f(x_n)\|_V. \tag{3.8} \]


The choice of $\eta_n$ is crucial, as its order of convergence (as $n \to \infty$) is
linked to the convergence of $x_n$ to $x^*$, see Dembo et al. (1982). More
precisely, if $\eta_n$ is uniformly less than one, $x_n$ converges to $x^*$ linearly.
Furthermore, assuming Lipschitz continuity of $D\nabla f$ in a neighborhood
of $x^*$, $\eta_n \le C \|x_n - x^*\|_V$ is necessary to obtain quadratic convergence.
However, “asymptotic quadratic convergence is achievable, but only
with effort on the part of the inner, linear iterative method, which is
usually unwarranted when overall time to solution is the metric”, see
Knoll and Keyes (2004). General-purpose strategies for the choice of
$\eta_n$ were proposed by Eisenstat and Walker (1996) and are discussed in
Sec. 3.2.4.

Despite the computational power of Newton’s method, there are several
practical disadvantages.

1. Programming the second derivatives of a function can be tedious, 
and doing it efficiently is often challenging. These problems can be 
partly overcome by automatic differentiation techniques (Griewank 
and Walther, 2008). 

2. If $V$ is $m$-dimensional and the equation for the Newton increment is
solved directly, $O(m^3)$ operations are required. For large $m$, this can
be excessive. If the Hessian is sparse, iterative solvers can be used to
reduce the computational complexity to $O(m^2)$.

3. For inexact Newton methods, the optimal choice of the Newton
forcing term $\{\eta_n\}$ in (3.8) is difficult. Although general-purpose
strategies have been developed (Eisenstat and Walker, 1996), the
following problem remains. Suppose you wish to find a $\delta$-critical
point, i.e., to find a solution to the inequality
\[ \|\nabla f(x)\|_V \le \delta, \]
and your current iterate $x_n$ almost satisfies the inequality. How
accurately do you have to solve for the increment to ensure that $x_{n+1}$ is
$\delta$-critical?

Points 1 and 3 motivated the development of Quasi-Newton methods,
which we shall discuss next.


3.2.2 From Newton to BFGS 


Quasi-Newton methods replace the Hessian $D\nabla f(x_n)$ in the linear
equation
\[ D\nabla f(x_n)[\xi_n] = -\nabla f(x_n) \tag{3.9} \]
by an approximation $B_n$, which is required to fulfill the secant condition
\[ y_n = B_{n+1} s_n, \quad \text{where } s_n = x_{n+1} - x_n \text{ and } y_n = \nabla f(x_{n+1}) - \nabla f(x_n). \tag{3.10} \]


Among the most powerful Quasi-Newton methods is the Broyden-Fletcher-Goldfarb-Shanno
(BFGS) algorithm (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970),
which recursively updates an approximation of the Hessian
\[ B_{n+1} = B_n + \frac{y_n \otimes \langle y_n, \cdot \rangle_V}{\langle y_n, s_n \rangle_V} - \frac{B_n s_n \otimes \langle B_n s_n, \cdot \rangle_V}{\langle s_n, B_n s_n \rangle_V} \tag{3.11} \]
for a given $B_0 \in L(V, V)$. If the operator $B_0$ is self-adjoint and positive
definite, the subsequent $B_n \in L(V, V)$ will inherit the symmetry and
positive definiteness property. Alternatively, an update formula corresponding
to (3.11) is available for the inverse of the Hessian $H_n = B_n^{-1}$,
\[ H_{n+1} = \left( 1 - \frac{s_n \otimes \langle y_n, \cdot \rangle_V}{\langle y_n, s_n \rangle_V} \right) H_n \left( 1 - \frac{y_n \otimes \langle s_n, \cdot \rangle_V}{\langle y_n, s_n \rangle_V} \right) + \frac{s_n \otimes \langle s_n, \cdot \rangle_V}{\langle y_n, s_n \rangle_V}. \tag{3.12} \]
With this formula at hand, $\xi_n = -H_n \nabla f(x_n)$ can be computed without
solving the linear system (3.9). Thus, the damped BFGS method may be
rewritten
\[ x_{n+1} = x_n - \alpha_n H_n \nabla f(x_n). \tag{3.13} \]


Global superlinear convergence of the BFGS method (3.13) with inexact
line search respecting the Wolfe conditions (3.6) and (3.7) and uniformly
convex and Lipschitz-continuous objective functions in finite dimensions
has been established by Powell (1976). In the general Hilbert space
setting, only linear convergence (Turner and Huntley, 1976; Griewank,
1987) can be expected, see Griewank (1987) for counterexamples. If
the Hessian at the critical point $x^*$ and the inverse $H_0^{-1}$ of the initial
approximation of the inverse Hessian differ by a compact linear operator,
superlinear convergence can be established (Griewank, 1987). More generally,
superlinear convergence is characterized by Dennis and Moré (1977).
However, their criterion is difficult to verify for a particular problem at
hand.
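The inverse update (3.12) can be checked numerically in finite dimensions. The following is a minimal sketch, assuming $V = \mathbb{R}^n$ with the Euclidean inner product and dense matrices; the function name `bfgs_inverse_update` and the quadratic test data are illustrative assumptions, not part of the solvers described in this chapter.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """Inverse BFGS update (3.12) for dense matrices and the Euclidean inner product."""
    rho = 1.0 / np.dot(y, s)           # requires the curvature condition <y, s> > 0
    identity = np.eye(len(s))
    left = identity - rho * np.outer(s, y)    # 1 - s (x) <y, .> / <y, s>
    right = identity - rho * np.outer(y, s)   # 1 - y (x) <s, .> / <y, s>
    return left @ H @ right + rho * np.outer(s, s)

# Hypothetical test data: for f(x) = 0.5 x^T A x the gradient difference is y = A s.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
s = np.array([1.0, -2.0])
y = A @ s
H_new = bfgs_inverse_update(np.eye(2), s, y)
```

By construction, the updated operator satisfies the inverse form of the secant condition (3.10), $H_{n+1} y_n = s_n$, and stays symmetric positive definite as long as $\langle y_n, s_n \rangle_V > 0$.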

The BFGS method still keeps the Hessian (or its inverse) in memory.
In particular, due to the rank-two update, $B_n$ quickly becomes fully
populated, restricting the method’s utility for large scale applications.
Nocedal (1980) introduced a limited-memory variant of BFGS (L-BFGS)
depending on a positive integer $m$, such that only the last $m$ differences
of iterates $s_n$ and gradients $y_n$ are kept in storage for updating the
inverse Hessian. More precisely, for any $n$ and $l = 0, \ldots, m-1$, Nocedal
proposed the formula
\[ H_n^{l+1} = \left( 1 - \frac{s_j \otimes \langle y_j, \cdot \rangle_V}{\langle y_j, s_j \rangle_V} \right) H_n^l \left( 1 - \frac{y_j \otimes \langle s_j, \cdot \rangle_V}{\langle y_j, s_j \rangle_V} \right) + \frac{s_j \otimes \langle s_j, \cdot \rangle_V}{\langle y_j, s_j \rangle_V}, \quad j = n - m + l, \tag{3.14} \]
for some initial approximation $H_n^0$, and where we formally set $y_n$ and
$s_n$ to zero for $n < 0$. Typically, the initial approximation is chosen as
a multiple of the identity, $H_n^0 = \theta_n \, \mathrm{Id}$. A common choice for the scaling
factor is given by $\theta_n = \langle s_{n-1}, y_{n-1} \rangle_V / \langle y_{n-1}, y_{n-1} \rangle_V$, see Shanno and
Puah (1978) and Liu and Nocedal (1989), corresponding to the Barzilai-Borwein
stepsize (Barzilai and Borwein, 1988). The damped L-BFGS
iteration reads
\[ x_{n+1} = x_n - \alpha_n H_n^m \nabla f(x_n). \tag{3.15} \]
How to implement the update (3.15) in the context of FFT-based
micromechanics is discussed in Sec. 3.3.3. For strongly convex and
Lipschitz-continuous objective functions, convergence of L-BFGS under
the Wolfe conditions (3.6) and (3.7) in finite dimensions was established
by Liu and Nocedal (1989). In contrast to BFGS, the convergence
to $x^*$ is only linear.


3.2.3 The line-search procedure of Dong 


Global convergence of Newton’s method and (L-)BFGS depends on a 
flexible line-search procedure. Exact line search is typically infeasible 
in practice, because evaluating the gradient of the objective function 
involves non-linear, and often quite costly, operations. Thus, approxi- 
mate line-search procedures ensuring sufficient decrease per iteration 
are mandatory, involving, for instance, the Wolfe conditions (3.6) and
(3.7). In particular, using the Wolfe conditions as criterion for the line
search is crucial for ensuring global convergence of the (L-)BFGS method. 
Satisfying the Wolfe conditions guarantees that the curvature condition
\[ \langle y_n, s_n \rangle_V > 0 \tag{3.16} \]
holds, which is necessary for the positive definiteness of the iterates $B_n$,
see Sec. 6.1 in Nocedal and Wright (1999).

For FFT-based micromechanics (to be discussed in Sec. 3.3), function
evaluations are not available, in general. The reason is that, in contrast
to the stress, the Helmholtz free energy or the dissipation potential, the
condensed potential $f$ for the non-linear material law, relating strains
and stresses, has no physical meaning (because it depends on the time
discretization and mixes the Helmholtz free energy and the dissipation
potential). In particular, the Wolfe condition (3.6) cannot be evaluated
per se. As a workaround, Dong (2010) proposed to replace the first Wolfe
condition (3.6) by the inequality
\[ \langle \nabla f(x_n + \alpha_n d_n), d_n \rangle_V \le c_1 \langle \nabla f(x_n), d_n \rangle_V, \tag{3.17} \]
where $d_n$ denotes the search direction. Inequality (3.17) implies (3.6) if
the gradient $\nabla f : V \to V$ is monotone, i.e., satisfies
\[ \langle \nabla f(x) - \nabla f(y), x - y \rangle_V \ge 0, \quad x, y \in V. \tag{3.18} \]
In mechanics, the latter is equivalent to the monotonicity of the stress,
considered as a function of the strain.
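A gradient-only bracketing line search built on (3.17) and (3.7) can be sketched as follows. This is an illustrative sketch, not the exact procedure of Dong (2010): it keeps $c_1$ fixed, and it bisects once an upper bound on the step is known, which is an assumption made here for robustness. The function name `dong_line_search` and the quadratic test problem are hypothetical.

```python
import numpy as np

def dong_line_search(grad, x, d, c1=1e-4, c2=0.9, maxiter=50):
    """Find a step alpha with c2*g0 <= <grad(x + alpha d), d> <= c1*g0, using gradients only."""
    g0 = np.dot(grad(x), d)            # directional derivative at alpha = 0 (negative for descent)
    mu, nu, alpha = 0.0, np.inf, 1.0   # lower bound, upper bound, trial step
    for _ in range(maxiter):
        g = np.dot(grad(x + alpha * d), d)
        if g > c1 * g0:                # (3.17) violated: step too long, shrink
            nu = alpha
            alpha = 0.5 * (mu + nu)
        elif g < c2 * g0:              # (3.7) violated: step too short, grow
            mu = alpha
            alpha = 2.0 * mu if np.isinf(nu) else 0.5 * (mu + nu)
        else:
            break
    return alpha

# Hypothetical monotone gradient: f(x) = 0.5 x^T A x with A positive definite.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x0 = np.array([1.0, 1.0])
d0 = -grad(x0)                         # steepest-descent direction
alpha = dong_line_search(grad, x0, d0)
```

Only inner products of gradients with the search direction enter, so no evaluation of $f$ itself is needed.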


3.2.4 Strategies for choosing the forcing term 


For inexact Newton methods, the choice of the forcing term $\{\eta_n\}$ in (3.8)
is crucial for the overall efficiency of the scheme. At iterates $\{x_n\}$ far
away from the solution, $\nabla f$ and its linear approximation may disagree
significantly. Thus, solving the linear system (3.9) to a high accuracy may
waste computational effort without substantially improving the overall
convergence behavior (Eisenstat and Walker, 1996). This is commonly
called oversolving. Setting $\eta_n$ to a moderate constant value, e.g., $\eta_n = 0.1$
as suggested by Kelley (2018), can be reasonable but may not be optimal
for all problems. Eisenstat and Walker (1996) propose more involved
strategies, taking $\nabla f$ into account. Their first strategy, named choice 1,
reads
\[ \eta_n = \frac{\left| \|\nabla f(x_n)\|_V - \|D\nabla f(x_{n-1})[\xi_{n-1}] + \nabla f(x_{n-1})\|_V \right|}{\|\nabla f(x_{n-1})\|_V} \tag{3.19} \]
with an initial value $\eta_0 \in [0, 1)$. This choice directly measures the
disagreement between the gradient and its linear approximation. Thus,
the value of $\eta_n$ decreases as the Newton iterates $\{x_n\}$ approach the
solution of the system. The alternative choice 2 by Eisenstat and Walker
(1996) is given by
\[ \eta_n = \lambda \left( \frac{\|\nabla f(x_n)\|_V}{\|\nabla f(x_{n-1})\|_V} \right)^{\beta} \tag{3.20} \]
with parameters $\lambda \in [0, 1]$ and $\beta \in (1, 2]$. The ratio of consecutive residuals
provides a measure for the convergence rate between the current and
last iteration. Hence, close to the solution, where a faster convergence
behavior is expected, $\eta_n$ decreases. Setting the parameter $\beta = (1 + \sqrt{5})/2$
results in a comparable convergence order for choices 1 and 2. Additionally,
Eisenstat and Walker suggest a safeguard for each choice to prevent a
premature decrease of $\eta_n$ far away from the solution. This is achieved by
limiting the decrease of $\eta_n$ by a factor of $\eta_{n-1}$ above a certain threshold.
The safeguard for choice 1 reads
\[ \eta_n^{\mathrm{safe}} = \begin{cases} \max\left( \eta_n, \eta_{n-1}^{(1+\sqrt{5})/2} \right), & \text{if } \eta_{n-1}^{(1+\sqrt{5})/2} > 0.1, \\ \eta_n, & \text{otherwise,} \end{cases} \tag{3.21} \]
and the safeguard for choice 2 is given by
\[ \eta_n^{\mathrm{safe}} = \begin{cases} \max\left( \eta_n, \lambda \eta_{n-1}^{\beta} \right), & \text{if } \lambda \eta_{n-1}^{\beta} > 0.1, \\ \eta_n, & \text{otherwise.} \end{cases} \tag{3.22} \]


Even with the presented forcing term choices and safeguards in place,
oversolving may occur in the final Newton iteration. Indeed, suppose
we want to solve (3.3) up to a certain accuracy
\[ \|\nabla f(x)\|_V \le \delta, \tag{3.23} \]
and the current iterate $x_n$ almost satisfies (3.23). With a small value
for $\eta_n$, the final Newton iteration may reduce $\|\nabla f(x)\|_V$ far below the
desired accuracy $\delta$. To prevent this type of oversolving, the following
safeguard
\[ \eta_n = \min\left( \eta_{\max}, \max\left( \eta_n^{\mathrm{safe}}, 0.5\, \delta / \|\nabla f(x_n)\|_V \right) \right) \tag{3.24} \]
with $\eta_{\max} \in [0, 1)$ is suggested in Sec. 6.3 in Kelley’s book (Kelley, 1995).
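The interplay of forcing-term choice 2 with its two safeguards can be sketched in a few lines. The function below is a minimal illustration, assuming the residual norms $\|\nabla f(x_n)\|_V$ and $\|\nabla f(x_{n-1})\|_V$ are already available as floats; the name `forcing_term_choice2` and its argument names are hypothetical.

```python
def forcing_term_choice2(res, res_prev, eta_prev, delta,
                         lam=1.0, beta=(1 + 5 ** 0.5) / 2, eta_max=0.75):
    """Forcing term (3.20) with safeguard (3.22) and the oversolving safeguard (3.24).

    res, res_prev : current and previous residual norms
    eta_prev      : forcing term used in the previous Newton step
    delta         : target tolerance of the non-linear solve
    """
    eta = lam * (res / res_prev) ** beta        # choice 2, (3.20)
    threshold = lam * eta_prev ** beta
    if threshold > 0.1:                         # safeguard (3.22)
        eta = max(eta, threshold)
    # safeguard (3.24): do not solve more accurately than the final tolerance requires
    return min(eta_max, max(eta, 0.5 * delta / res))

# Far from the solution the safeguard (3.22) dominates, near convergence (3.24) does.
eta_far = forcing_term_choice2(1e-2, 1e-1, 0.75, 1e-5)
eta_near = forcing_term_choice2(1.5e-5, 1e-3, 0.01, 1e-5)
```

In the second call the residual is only slightly above the tolerance, so the returned value is governed by $0.5\,\delta/\|\nabla f(x_n)\|_V$ and the linear system is solved only loosely.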


3.3 Newton and Quasi-Newton methods in 
FFT-based micromechanics 


3.3.1 Newton’s method 


We consider periodic homogenization problems (Bakhvalov and Panasenko,
1989) in the context of small-strain continuum mechanics. Let $Y$ be a
rectangular cell in $\mathbb{R}^d$ ($d = 2, 3$). The Hilbert space for periodic and
mean-free displacement fluctuations is
\[ H^1_{\#}(Y; \mathbb{R}^d) = \left\{ u \in H^1(Y; \mathbb{R}^d) \mid u \text{ periodic}, \ \partial_n u \text{ anti-periodic on } \partial Y, \ \langle u \rangle_Y = 0 \right\}, \]
where the mean of any integrable scalar or vector valued function $q$ on
$Y$ is defined by
\[ \langle q \rangle_Y = \frac{1}{|Y|} \int_Y q \, \mathrm{d}x, \]
together with the inner product induced by the quadratic form
\[ \|u\|^2_{H^1_{\#}} = \frac{1}{|Y|} \int_Y \|\nabla^s u\|^2 \, \mathrm{d}x, \]
where $\nabla^s$ denotes the symmetrized gradient and the quadratic form
in the integrand corresponds to the Frobenius inner product on square
matrices, $\|S\|^2 = \mathrm{tr}(S^T S)$.

Furthermore, let a (heterogeneous) strain energy potential
\[ w : Y \times \mathrm{Sym}(d) \to \mathbb{R}, \quad (x, \varepsilon) \mapsto w(x, \varepsilon), \]
be given, measurable in $Y$ and $C^2$ in $\mathrm{Sym}(d)$, where $\mathrm{Sym}(d)$ denotes
the linear space of symmetric $d \times d$ matrices. Denote by $\sigma = \partial w / \partial \varepsilon$ the
associated stress function, and by $\partial^2 w / \partial \varepsilon^2$ its Hessian. For prescribed strain
$\bar{\varepsilon}$, we seek a minimizer of the function
\[ H^1_{\#}(Y; \mathbb{R}^d) \ni u \mapsto f(u) = \langle w(\cdot, \bar{\varepsilon} + \nabla^s u) \rangle_Y. \tag{3.26} \]
To conform to the framework of the previous section, we compute the
differential of $f$
\[ Df(u) = -\mathrm{div}\, \sigma(\cdot, \bar{\varepsilon} + \nabla^s u) \]
and its gradient
\[ \nabla f(u) = G\, \mathrm{div}\, \sigma(\cdot, \bar{\varepsilon} + \nabla^s u), \]
where $G$ is the Green’s operator $G = (\mathrm{div}\, \nabla^s)^{-1}$, which corresponds to
the negative of the Riesz map on $H^1_{\#}(Y; \mathbb{R}^d)$. In this context, the equation
for the $n$-th Newton increment $\xi_n \in H^1_{\#}(Y; \mathbb{R}^d)$, corresponding to (3.9),
is given by
\[ G\, \mathrm{div} \left[ \frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon_n) : \nabla^s \xi_n \right] = -G\, \mathrm{div}\, \sigma(\varepsilon_n), \tag{3.27} \]
where $\varepsilon_n = \bar{\varepsilon} + \nabla^s u_n$. For any $\alpha_0 > 0$, equation (3.27) is equivalent to
the Lippmann-Schwinger equation
\[ \Xi_n + \Gamma^0 : \left[ \frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon_n) - C^0 \right] : \Xi_n = -\Gamma^0 : \sigma(\varepsilon_n), \tag{3.28} \]
where $C^0 = \alpha_0 \, \mathrm{Id}$, $\Gamma^0 = (\alpha_0)^{-1} \nabla^s G\, \mathrm{div}$, via the identification $\Xi_n = \nabla^s \xi_n$.
Note, if a strain-based iterative scheme is used to solve (3.28), only the
converged solution $\Xi^*$ is compatible, in general, whereas this may be
false for the iterates $\{\Xi_n\}$. This is the case, for instance, for polarization-based
schemes such as the Eyre-Milton method used by Kabel et al. (2014).
Typically, (3.28) is solved using Krylov-subspace methods, such as CG
or MINRES (Zeman et al., 2010; Brisard and Dormieux, 2010; 2012),
due to their excellent performance for linear problems. In addition,
these schemes operate on compatible strain fields, permitting a memory
efficient implementation (Kabel et al., 2014). With these formulae at
hand, we may formulate a damped Newton scheme, depending on
Dong’s version of the Wolfe conditions, (3.17) and (3.7). The resulting
algorithm is summarized in Alg. 1.
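Algorithm 1 Newton’s method with backtracking by Dong (2010)
($\bar{\varepsilon}$, $C^0$, $c_{1,0}$, $c_2$, maxiter)

1: $\varepsilon \leftarrow \bar{\varepsilon}$
2: $\varepsilon \leftarrow$ MSiterate($\varepsilon$, $\bar{\varepsilon}$, $C^0$)
3: repeat
4:   $\Xi \leftarrow -\left( \mathrm{Id} + \Gamma^0 : \left[ \frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon) - C^0 \right] \right)^{-1} : \Gamma^0 : \sigma(\varepsilon)$   ▷ Solving (3.28)
5:   $\mu \leftarrow 0$
6:   $\nu \leftarrow +\infty$
7:   $\alpha \leftarrow 1$
8:   $k \leftarrow 0$
9:   while $k <$ maxiter do
10:     $k \leftarrow k + 1$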
11: ci + crol (c2)*) (c2)* 
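12:     if $\langle \Gamma^0 : \sigma(\varepsilon + \alpha \Xi), \Xi \rangle_{L^2} > c_1 \langle \Gamma^0 : \sigma(\varepsilon), \Xi \rangle_{L^2}$ then
13:       $\nu \leftarrow \alpha$
14:       $\alpha \leftarrow 0.5 (\mu + \nu)$
15:     else if $\langle \Gamma^0 : \sigma(\varepsilon + \alpha \Xi), \Xi \rangle_{L^2} < c_2 \langle \Gamma^0 : \sigma(\varepsilon), \Xi \rangle_{L^2}$ then
16:       $\mu \leftarrow \alpha$
17:       $\alpha \leftarrow 2 \mu$
18:     else
19:       break
20:     end if
21:   end while
22:   $\varepsilon \leftarrow \varepsilon + \alpha \Xi$
23: until Convergence   ▷ Criterion (3.29)
24: return $\varepsilon$

Algorithm 1 (continued): MSiterate($\varepsilon$, $\bar{\varepsilon}$, $C^0$)

1: $\varepsilon \leftarrow \sigma(\varepsilon) - C^0 : \varepsilon$
2: $\hat{\varepsilon} \leftarrow \mathrm{FFT}(\varepsilon)$
3: $\hat{\varepsilon} \leftarrow -\hat{\Gamma}^0 : \hat{\varepsilon}$, $\hat{\varepsilon}(0) = \bar{\varepsilon}$
4: $\varepsilon \leftarrow \mathrm{FFT}^{-1}(\hat{\varepsilon})$
5: return $\varepsilon$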


The convergence criterion reads
\[ \alpha_0 \frac{\|\Gamma^0 : \sigma(\varepsilon)\|_{L^2}}{\|\langle \sigma(\varepsilon) \rangle_Y\|} = \frac{\|\mathrm{div}\, \sigma(\varepsilon)\|_{H^{-1}_{\#}}}{\|\langle \sigma(\varepsilon) \rangle_Y\|} \le \delta \tag{3.29} \]
with a prescribed tolerance $\delta$. This choice was introduced and discussed
in Schneider et al. (2019). Both the convergence criterion (3.29) and
the convergence behavior of the linear Krylov-subspace solver are
independent of $\alpha_0$, see Zeman et al. (2010). As we start with a single
iteration of the basic scheme, we use the associated reference material
$\alpha_0 = (\alpha_+ + \alpha_-)/2$ with the extremal eigenvalues $\alpha_+$ and $\alpha_-$ of the
material tangent evaluated over all voxels. For the parameters of the
line-search procedure, we choose $c_{1,0} = 10^{-4}$ and $c_2 = 0.9$, see Dong
(2010). A few remarks on the practical implementation are in order.

1. The storage requirements for Newton-CG read: 1 current strain, 
and 4 strains for solving the linear system by CG. Furthermore, 
the symmetric material tangent needs to be stored. In 3 spatial 
dimensions, this corresponds to 21 scalar components for every voxel, 
the equivalent of 3.5 strain fields. In total, the storage requirements 
amount to 8.5 strain-like fields. Using the line search procedure by 
Dong (2010) involves storing another strain field, as gradient and 
Newton step have to be kept in memory separately. If affine-linear 
extrapolation is needed, an additional strain needs to be stored. 
2. We found that storing the Hessian in single precision does
not influence the performance of Newton’s method significantly. In
contrast, the current strain and the vectors needed for CG need to be
stored in double precision to avoid numerical problems (in particular,
in connection with the FFT).

3. Similar to the previous comment, the last converged strain can be 
stored in single precision, as it solely serves as the initial condition. 
This remark holds true for other solution methods in FFT-based 
micromechanics, as well. 

4. For finite-difference and finite-element discretizations (Willot, 2015; 
Schneider et al., 2016; 2017; Leuschner and Fritzen, 2018), both the 
conjugate-gradient method and the Newton update can be imple- 
mented on displacement instead of strain (Kabel et al., 2014; Grimm- 
Strele and Kabel, 2019), saving 50% of memory for the corresponding 
fields. 

5. Combining all three previous memory optimizations, only 9 displacement
fields need to be stored. For a microstructure with $512^3$ voxels,
27 GB of RAM are needed, not taking into account internal variables.
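The figure of 27 GB in the last remark can be reproduced with a short calculation. The sketch below assumes that each of the 9 fields counts as a double-precision displacement field with 3 components per voxel, and that GB is read as $2^{30}$ bytes; the variable names are illustrative.

```python
voxels = 512 ** 3                    # voxels in the microstructure
bytes_per_field = voxels * 3 * 8     # 3 displacement components, 8 bytes each (double precision)
total_bytes = 9 * bytes_per_field    # 9 displacement fields in total
total_gib = total_bytes / 2 ** 30    # 27.0
```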


3.3.2 Anderson acceleration 


The BFGS method as outlined in Sec. 3.2.2 requires the Hessian $B_n$ (or its
inverse) to be kept in memory. Therefore, the algorithm cannot be directly
applied in the context of FFT-based micromechanics, as the Hessian is
usually not assembled in this setting due to memory limitations. To
circumvent this problem, limited-memory Quasi-Newton methods were
developed, which implicitly update the Hessian by storing a limited
number $m$ of recent iterates and gradients, with $m$ commonly called the
depth of the scheme.

One such algorithm is Anderson acceleration (Anderson, 1965), which
was recently applied by Shantraj et al. (2015) and Chen et al. (2019a;b) in
the context of FFT-based micromechanics. A general discussion of the
scheme and its implementation is found, e.g., in Walker and Ni (2011)
or Kelley (2018). Eyert (1996) and Fang and Saad (2009) pointed out
the relation of Anderson acceleration to Quasi-Newton schemes and
identified it as a generalized form of Broyden’s second method. Recently,
Evans et al. (2020) provided a proof that Anderson acceleration improves
the convergence rate of linearly converging fixed-point methods.

For an integer depth $m \ge 1$, Anderson acceleration requires the last $m + 1$
iterates $\varepsilon_k$ and gradients $g_k = \Gamma^0 : \sigma(\varepsilon_k)$ to be kept in memory, resulting
in a memory footprint of $2m + 2$ strain-like fields. The algorithm is
outlined in Alg. 2 for the convenience of the reader. Note that for the
given algorithm Anderson acceleration is applied in every iteration. In
contrast, Chen et al. (2019a;b) only accelerate every third iteration and
apply the basic scheme (Moulinec and Suquet, 1998) otherwise.


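Algorithm 2 Anderson acceleration ($\bar{\varepsilon}$, $C^0$)

1: $\varepsilon_0 \leftarrow \bar{\varepsilon}$
2: $\varepsilon_1 \leftarrow$ MSiterate($\varepsilon_0$, $\bar{\varepsilon}$, $C^0$)
3: $k \leftarrow 0$
4: repeat
5:   $k \leftarrow k + 1$
6:   $m_k \leftarrow \min(m, k)$
7:   $g_k \leftarrow \sigma(\varepsilon_k)$
8:   $\hat{g}_k \leftarrow \mathrm{FFT}(g_k)$
9:   $\hat{g}_k \leftarrow \hat{\Gamma}^0 : \hat{g}_k$, $\hat{g}_k(0) = 0$
10:   $g_k \leftarrow \mathrm{FFT}^{-1}(\hat{g}_k)$
11:   $(\alpha_0, \ldots, \alpha_{m_k}) \leftarrow \arg\min \left\| \sum_{j=0}^{m_k} \alpha_j g_{k-m_k+j} \right\|_{L^2}$ s.t. $\sum_{j=0}^{m_k} \alpha_j = 1$
12:   $\varepsilon_{k+1} \leftarrow \sum_{j=0}^{m_k} \alpha_j \left( \varepsilon_{k-m_k+j} - g_{k-m_k+j} \right)$
13:   Delete $\varepsilon_{k-m}$, $g_{k-m}$
14: until Convergence   ▷ Criterion (3.29)
15: return $\varepsilon_{k+1}$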
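Determining the coefficients $\alpha = (\alpha_0, \ldots, \alpha_{m_k})$ by solving the minimization
problem
\[ \min \left\| \sum_{j=0}^{m_k} \alpha_j g_{k-m_k+j} \right\|_{L^2} \quad \text{s.t.} \quad \sum_{j=0}^{m_k} \alpha_j = 1 \tag{3.30} \]
is the key step in one iteration of the Anderson acceleration. To solve
this problem, we reformulate (3.30) in terms of the Lagrangian function
\[ \sum_{i=0}^{m_k} \sum_{j=0}^{m_k} \alpha_i \alpha_j \langle g_{k-m_k+i}, g_{k-m_k+j} \rangle_{L^2} + \lambda \left( \sum_{j=0}^{m_k} \alpha_j - 1 \right) \to \min_{\alpha} \max_{\lambda} \tag{3.31} \]
by squaring the objective function and introducing the Lagrangian
multiplier $\lambda$. The associated KKT conditions
\[ \begin{aligned} 2 \sum_{j=0}^{m_k} \alpha_j \langle g_{k-m_k}, g_{k-m_k+j} \rangle_{L^2} + \lambda &= 0, \\ &\ \ \vdots \\ 2 \sum_{j=0}^{m_k} \alpha_j \langle g_k, g_{k-m_k+j} \rangle_{L^2} + \lambda &= 0, \\ \sum_{j=0}^{m_k} \alpha_j - 1 &= 0 \end{aligned} \tag{3.32} \]
constitute a system of $m_k + 2$ linear equations, which are solved for $\alpha$
and $\lambda$.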
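Solving the KKT system for the Anderson coefficients amounts to a small dense linear solve. The sketch below is illustrative: it assumes the stored residual fields have been flattened into the rows of a NumPy array, uses the plain Euclidean inner product in place of the $L^2$ inner product (a constant volume factor does not change the coefficients), and includes the factor 2 from differentiating the squared norm, which only rescales the multiplier. The name `anderson_coefficients` is hypothetical.

```python
import numpy as np

def anderson_coefficients(G):
    """Solve the KKT system for the Anderson coefficients.

    G : array of shape (m_k + 1, N); row j holds the flattened residual g_{k - m_k + j}.
    Returns the coefficients alpha (summing to one) and the Lagrange multiplier.
    """
    n = G.shape[0]
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = 2.0 * (G @ G.T)      # Gram matrix of the residuals
    K[:n, n] = 1.0                   # column belonging to the multiplier
    K[n, :n] = 1.0                   # constraint: coefficients sum to one
    rhs = np.zeros(n + 1)
    rhs[n] = 1.0
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n]

# Two orthonormal residuals: the minimizer weights both equally.
alpha, lam = anderson_coefficients(np.eye(2))
```

If the stored residuals become nearly linearly dependent, the Gram matrix turns ill-conditioned; a least-squares solve would be a more robust alternative in that case.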


3.3.3 Limited-memory BFGS 


As another limited-memory Quasi-Newton scheme, we propose to apply
Nocedal’s L-BFGS method, see Sec. 3.2.2, to FFT-based micromechanics.
The L-BFGS method can be implemented with a memory footprint of
$2m + 4$ strain-like fields. More precisely, the last $m$ differences of iterates
$s_k = \varepsilon_{k+1} - \varepsilon_k$, differences of gradients $y_k = \Gamma^0 : \sigma(\varepsilon_{k+1}) - \Gamma^0 : \sigma(\varepsilon_k)$
and inner products $\rho_k = 1 / \langle y_k, s_k \rangle_{L^2}$ have to be kept in memory. In
addition, the current strain $\varepsilon$ and gradient $\Gamma^0 : \sigma(\varepsilon)$ and the last strain
$\varepsilon_n$ and gradient $\Gamma^0 : \sigma(\varepsilon_n)$ need to be stored.

For evaluating the L-BFGS increment $\Xi = -H_n^m \nabla f(x_n)$, the two-loop
recursion of Matthies and Strang (1979) proves useful. A pseudo code is
given in Alg. 3, where we use the initial Hessian
\[ H_n^0 = \frac{\langle s_{n-1}, y_{n-1} \rangle_{L^2}}{\langle y_{n-1}, y_{n-1} \rangle_{L^2}} \, \mathrm{Id} \tag{3.33} \]
as suggested by Shanno and Puah (1978) and Liu and Nocedal (1989).
The algorithm takes the current gradient $\Gamma^0 : \sigma(\varepsilon_k)$ as input and
overwrites it by the increment $\Xi_k$.


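Algorithm 3 Two-loop recursion for evaluating $H_n^m q$ for given $q$
(Matthies and Strang, 1979; Nocedal, 1980)

1: for $k = m-1, m-2, \ldots, 0$ do
2:   $\alpha_k \leftarrow \rho_k \langle s_k, q \rangle_{L^2}$
3:   $q \leftarrow q - \alpha_k y_k$
4: end for
5: $q \leftarrow \frac{\langle s_{n-1}, y_{n-1} \rangle_{L^2}}{\langle y_{n-1}, y_{n-1} \rangle_{L^2}} \, q$
6: for $k = 0, 1, \ldots, m-1$ do
7:   $\beta_k \leftarrow \rho_k \langle y_k, q \rangle_{L^2}$
8:   $q \leftarrow q + (\alpha_k - \beta_k) s_k$
9: end for
10: return $q$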


The L-BFGS method is implemented analogously to Alg. 1, where the
two-loop recursion replaces the solution of the linear system (3.28) for
obtaining $\Xi$.
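The two-loop recursion translates almost line by line into NumPy. The following is a minimal sketch, assuming flattened real-valued fields, the Euclidean inner product in place of the $L^2$ inner product, and lists `s_list`, `y_list` ordered from oldest to newest; these names and the diagonal test operator are illustrative assumptions.

```python
import numpy as np

def two_loop(q, s_list, y_list):
    """Apply the L-BFGS inverse Hessian approximation to q via the two-loop recursion."""
    q = np.array(q, dtype=float)     # work on a copy
    m = len(s_list)
    rho = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alpha = [0.0] * m
    for k in reversed(range(m)):     # first loop: newest to oldest pair
        alpha[k] = rho[k] * np.dot(s_list[k], q)
        q -= alpha[k] * y_list[k]
    # scaled identity as initial inverse Hessian, built from the newest pair, cf. (3.33)
    q *= np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    for k in range(m):               # second loop: oldest to newest pair
        beta = rho[k] * np.dot(y_list[k], q)
        q += (alpha[k] - beta) * s_list[k]
    return q

# Hypothetical data from a quadratic with Hessian A, so that y = A s.
A = np.diag([2.0, 3.0, 4.0])
s_list = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 1.0])]
y_list = [A @ s for s in s_list]
r = two_loop(y_list[-1], s_list, y_list)
```

Applied to the newest gradient difference, the recursion returns the newest iterate difference, which is the secant condition in inverse form. The search increment is the negative of the returned vector when `q` is the current gradient.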
3.3.4 BFGS update of the material tangent


As an alternative to the limited-memory Quasi-Newton scheme, we
propose using the BFGS update to approximate the local material tangent
$\partial^2 w / \partial \varepsilon^2$ in (3.28) instead of the global Hessian of $f$ in (3.26). In this context,
the BFGS update reads
\[ C^{\mathrm{BFGS}}_{n+1} = C^{\mathrm{BFGS}}_n + \frac{\Delta\sigma_n \otimes \Delta\sigma_n}{\Delta\sigma_n : \Delta\varepsilon_n} - \frac{\left( C^{\mathrm{BFGS}}_n : \Delta\varepsilon_n \right) \otimes \left( C^{\mathrm{BFGS}}_n : \Delta\varepsilon_n \right)}{\Delta\varepsilon_n : C^{\mathrm{BFGS}}_n : \Delta\varepsilon_n}, \tag{3.34} \]
where
\[ \Delta\varepsilon_n = \varepsilon_{n+1} - \varepsilon_n \quad \text{and} \quad \Delta\sigma_n = \sigma(\varepsilon_{n+1}) - \sigma(\varepsilon_n). \]
We found that the material’s linear elastic stiffness serves as a decent
initial guess for $C^{\mathrm{BFGS}}_0$. Consequently, Alg. 1 may be applied with $C^{\mathrm{BFGS}}_n$
replacing $\frac{\partial^2 w}{\partial \varepsilon^2}(\varepsilon_n)$ in (3.28). Note that, in contrast to the limited-memory
schemes in Sec. 3.3.2 and Sec. 3.3.3, the linear system (3.28) still needs
to be solved with an iterative solver. In comparison to the Newton-CG
method, two additional strain-like fields need to be kept in memory to
compute $\Delta\sigma_n$.
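For a single voxel, the tangent update is a rank-two correction of a small symmetric matrix. The sketch below assumes strains and stresses are stored as 6-vectors in Mandel notation, so that the double contraction becomes a dot product and the tangent a $6 \times 6$ matrix; the function name `bfgs_tangent_update` and the numerical values are hypothetical.

```python
import numpy as np

def bfgs_tangent_update(C, d_eps, d_sig):
    """BFGS update of a local tangent C (6x6, Mandel notation) from strain and stress increments."""
    C_eps = C @ d_eps
    return (C
            + np.outer(d_sig, d_sig) / np.dot(d_sig, d_eps)
            - np.outer(C_eps, C_eps) / np.dot(d_eps, C_eps))

# Hypothetical increments for one voxel (arbitrary units).
C0 = 2.0 * np.eye(6)                                     # stand-in for the elastic stiffness
d_eps = np.array([1.0e-3, 0.0, 0.0, 0.0, 0.0, 0.0])
d_sig = np.array([1.5e-3, 0.5e-3, 0.0, 0.0, 0.0, 0.0])
C1 = bfgs_tangent_update(C0, d_eps, d_sig)
```

The updated tangent maps the strain increment onto the stress increment and remains symmetric. It stays positive definite as long as $\Delta\sigma_n : \Delta\varepsilon_n > 0$, which holds for strictly monotone stress-strain relations.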


3.4 Numerical demonstrations 


3.4.1 General setup 


The solution schemes were implemented in Python 2.7. Computationally
expensive operations such as the application of $\Gamma^0$ and the evaluation
of the material law were written as Cython extensions and parallelized
using OpenMP. For the fast Fourier transform, we relied on the FFTW
library (Frigo and Johnson, 2005). The computations ran on 6 threads
on a desktop computer with 32 GB RAM and an Intel i7-8700K CPU
with 6 cores and a clock rate of 3.7 GHz. An affine-linear extrapolation
(Moulinec and Suquet, 1998) was used as initial guess for the strain field
in case of multiple load steps. For the convergence criterion, we use
(3.29)
\[ \alpha_0 \frac{\|\Gamma^0 : \sigma(\varepsilon)\|_{L^2}}{\|\langle \sigma(\varepsilon) \rangle_Y\|} \le \delta, \]
where $\alpha_0$ is the scaling factor of the reference material $C^0 = \alpha_0 \, \mathrm{Id}$. As
$\Gamma^0 = (\alpha_0)^{-1} \nabla^s G\, \mathrm{div}$, this convergence criterion is actually independent
of $\alpha_0$. For this study, we use the reference material of the basic scheme
ao = (a; + a_)/2. The tolerance is set to 6 = 10~° in Sec. 3.4.2 and 
ô = 1074 in Sec. 3.4.3 and 3.4.4. Throughout, we utilize the staggered 
grid discretization (Schneider et al., 2016). 


3.4.2 Continuous glass-fiber reinforced polyamide 


In the following, we investigate the performance of Anderson acceleration
and the L-BFGS method, as discussed in Sec. 3.3.2 and Sec. 3.3.3, respectively,
with respect to the chosen depth $m$. As microstructure we consider
a polyamide matrix, reinforced by continuous glass fibers with a volume
fraction of 15%, and a resolution of $256^2$ pixels, see Fig. 3.1. Using a
2-dimensional structure enables investigating large values of the depth
$m$, without memory becoming a limiting factor. Following Doghri et al.
(2011), we assume that the mechanical behavior of the polyamide matrix
is governed by $J_2$-elastoplasticity, see Sec. 3.3 in Simo and Hughes (1998).
For the sake of simplicity, the rate-dependent behavior of the material is
neglected in this approach. A more involved material model, accounting
for viscoelastic and viscoplastic effects, was proposed, e.g., by Krairi et al.
(2019). The relation between the yield stress $\sigma_y$ and the equivalent plastic
strain $p = \sqrt{2/3} \int \|\dot{\varepsilon}_p\| \, \mathrm{d}t$ is modelled by a linear-exponential hardening
function
\[ \sigma_y(p) = \sigma_0 + k_1 p + k_2 \left( 1 - \exp(-m p) \right), \]
where $\sigma_0$ denotes the initial yield strength, $k_1$ denotes the asymptotic
hardening modulus and $k_2 = \sigma_\infty - \sigma_0$ denotes the difference between
the saturated and initial yield strength for $k_1 = 0$. The prefactor in the
exponential function is given by $m = \Theta / k_2$, where $\Theta$ denotes the initial
hardening modulus. The glass fibers are modelled as linear elastic. The
material parameters according to Doghri et al. (2011) are given in Tab. 3.1.
We apply mixed boundary conditions (Kabel et al., 2016), corresponding
to a uniaxial extension of 5% perpendicular to the fiber direction, in a
single load step.

Figure 3.1: Continuous glass-fiber reinforced polyamide. (a) Microstructure ($256^2$ pixels).
(b) Equivalent plastic strain at 5% uniaxial extension.
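The linear-exponential hardening law is straightforward to evaluate. The sketch below uses the matrix parameters listed in Tab. 3.1 as default values, reading the 29 MPa entry as the initial yield strength $\sigma_0$ (an interpretation made here); the function names are illustrative.

```python
import math

def yield_stress(p, sigma0=29.0, k1=139.0, k2=32.7, m=319.4):
    """Yield stress in MPa as a function of the equivalent plastic strain p."""
    return sigma0 + k1 * p + k2 * (1.0 - math.exp(-m * p))

def hardening_modulus(p, k1=139.0, k2=32.7, m=319.4):
    """Derivative of the yield stress with respect to p, in MPa."""
    return k1 + k2 * m * math.exp(-m * p)

sigma_initial = yield_stress(0.0)        # equals sigma0
sigma_at_5pct = yield_stress(0.05)       # exponential term is essentially saturated
```

The hardening modulus drops from its initial value towards $k_1$ once the exponential term saturates, which is why the tangent stiffness of the matrix becomes small compared to the elastic stiffness during plastification.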

The L-BFGS scheme and Anderson acceleration are investigated for
depths from 1 to 200. In addition, Moulinec-Suquet’s basic scheme
(Moulinec and Suquet, 1998), the basic scheme with Barzilai-Borwein
(BB) step-size control (Barzilai and Borwein, 1988; Schneider, 2019a),
the Newton-CG method and the BFGS-CG method are included as
benchmarks. For the Newton-CG method and the BFGS-CG method,
we use forcing-term choice 2 of Eisenstat-Walker (3.20), see Sec. 3.4.3.
The resulting iteration counts and the computational runtimes are given,
depending on the depth, in Fig. 3.2 and Tab. 3.2.

Table 3.1: Glass-fiber reinforced polyamide: Material parameters of fibers and matrix

Fibers   $E = 72$ GPa, $\nu = 0.22$
Matrix   $E = 2.1$ GPa, $\nu = 0.3$, $\sigma_0 = 29$ MPa,
         $k_1 = 139$ MPa, $k_2 = 32.7$ MPa, $m = 319.4$

Figure 3.2: Continuous glass-fiber reinforced polyamide: Iteration count (left) and
computation time (right) with respect to the chosen depth, for Anderson acceleration and L-BFGS


For Anderson acceleration, we observe that the required number of
iterations drops significantly up to a depth of 5 and stagnates for depths
larger than 50. Between the minimum depth of 1 and a depth of 200, i.e.,
keeping all iterates in memory, the iteration count decreases by 85%. In
contrast, the convergence behavior of L-BFGS is much less affected by
the chosen depth. From the onset, it requires far fewer iterations than
Anderson acceleration and exhibits a faster convergence behavior up to
depths of 20. For depths larger than 5, the iteration counts of L-BFGS
remain approximately constant with a decrease of about 20% compared
to a depth of 1.

Considering the overall computational effort, depths around 2 to 5
appear to be optimal for both schemes. Taking more iterates into account
increases the computational effort for each iteration, which offsets a
further decrease in iteration counts. For this range of depths, L-BFGS
and Anderson acceleration have memory footprints of 8 to 14 and 6 to 12
strain fields, respectively, compared to 8.5 for the Newton-CG method,
10.5 for the BFGS-CG method, 2 for the Barzilai-Borwein scheme and 1
for the basic scheme.

With the optimal depth choice, L-BFGS is the faster of the two limited-memory
schemes. However, it performs worse than the (Quasi-)Newton-Krylov
methods and the Barzilai-Borwein scheme, which exhibit similar
runtimes. Even though L-BFGS converges in fewer iterations than
the Barzilai-Borwein method, it is slower overall, due to the higher
computational cost per iteration. In particular, the parallelization of the
inner products in the two-loop recursion of Alg. 3 was not effective,
introducing a significant overhead, see Chen et al. (2014). The basic
scheme is the slowest of the investigated solvers, taking about an order
of magnitude longer to converge. Whereas its computational cost per
iteration is similar to the Barzilai-Borwein scheme, the required iteration
count is significantly higher, due to the pronounced material contrast of
the composite during plastification. In conclusion, we observe that the
Barzilai-Borwein scheme outclasses the investigated limited-memory
methods both in performance and memory footprint. Therefore, we do
not include the latter algorithms in the remaining numerical examples.
The performance comparison of the remaining algorithms is expanded
in Sec. 3.4.3 and Sec. 3.4.4 for more complex microstructures and material
laws, respectively.


Table 3.2: Continuous glass-fiber reinforced polyamide: Iteration counts and computa- 
tional runtime with respect to the depth used in the algorithm 


Method          Depth   Iter. count   Comp. time in s

Anderson acc.       1           915               8.7
                    2           426               4.2
                    5           251               3.2
                   10           306               5.5
                   20           213               5.5
                   50           140               6.5
                  100           139               9.5
                  200           139              10.4

L-BFGS              1           214               2.4
                    2           184               2.3
                    5           171               2.8
                   10           171               3.7
                   20           166               5.4
                   50           166              10.9
                  100           166              19.9
                  200           170              38.4

Newton CG - SANON 1.6 

233 (CG) 
BFGS CG : PINON) 5 
281 (CG) 
BB                  -           229               1.7
Basic scheme        -          3897              27.2


3.4.3 Porous short glass-fiber reinforced polyamide


Figure 3.3: Porous glass-fiber-reinforced polyamide. (a) Microstructure ($256^3$ voxels).
(b) Von Mises equivalent strain at 1% uniaxial extension ($J_2$-elastoplasticity).


We consider a porous polyamide matrix with short glass-fiber reinforcements,
see Fig. 3.3, which is resolved by $256^3$ voxels. The glass fibers
are unidirectionally aligned in x-direction with a volume fraction of
15%. The volume fraction of the pores is 1%. The material models and
parameters correspond to those in Sec. 3.4.2, see Tab. 3.1. The given
example constitutes a challenging non-linear test problem for the investigated
micromechanical solvers. Due to the high stiffness of the glass
fibers in comparison to the softer polymer matrix, the material contrast
between the two phases is large. During plastification, the contrast
increases even further as the minimum eigenvalue of the polyamide’s
tangential stiffness approaches 0, owing to the exponential hardening
law. In combination with the unidirectional short fiber structure, this
results in strong localization of the strain fields around the fibers, see
Fig. 3.3. Last but not least, due to the presence of pores, the material
contrast of the overall microstructure is infinite.

First, we investigate the different forcing term choices from Sec. 3.2.4 in
the FFT-based setting to identify a suitable general-purpose strategy for
the Newton-CG and BFGS-CG method. Next, we compare the performance
of the solvers with the given forcing term choice for studying the
material behavior under uniaxial extension.

Influence of the forcing term on convergence and runtime. In their
study on forcing term strategies, Eisenstat and Walker (1996) considered
numerical examples with up to $10^4$ degrees of freedom. In the context of
FFT-based micromechanics, much larger problem sizes are commonly
considered, as it takes high voxel counts to finely discretize complex
microstructures. Thus, we are interested in whether the results of
Eisenstat and Walker carry over to the FFT-based setting for our current
example with $6 \times 256^3 \approx 10^8$ degrees of freedom. Furthermore, we
investigate how the BFGS-CG scheme is affected by the different forcing
term strategies in comparison to the Newton-CG scheme. The following
choices are considered:
1. Choice 1 corresponds to the first adaptive strategy of Eisenstat and
Walker (1996) (3.19)

$$\eta_n = \frac{\left| \, \left\| \Gamma^0 : \sigma(\varepsilon_n) \right\| - \left\| \Gamma^0 : \left[ \sigma(\varepsilon_{n-1}) + \frac{\partial \sigma}{\partial \varepsilon}(\varepsilon_{n-1}) : \xi_{n-1} \right] \right\| \, \right|}{\left\| \Gamma^0 : \sigma(\varepsilon_{n-1}) \right\|}, \qquad (3.35)$$

with the associated safeguard (3.21) and Kelley's safeguard against
oversolving (3.24) in place. For this choice, the forcing term is
proportional to the disagreement between the gradient and its linear
approximation. Thus, $\eta_n$ decreases in the vicinity of the solution, and
the linear system is solved with increasing accuracy. We start with a
high value, i.e., low accuracy, of $\eta_0 = \eta_{\max} = 0.75$, which also
serves as the upper bound for the forcing term.

2. Choice 2 corresponds to the second forcing-term strategy (3.20) by
Eisenstat and Walker (1996)

$$\eta_n = \gamma \left( \frac{\left\| \Gamma^0 : \sigma(\varepsilon_n) \right\|}{\left\| \Gamma^0 : \sigma(\varepsilon_{n-1}) \right\|} \right)^{\alpha}, \qquad (3.36)$$

with safeguards (3.22) and (3.24) preventing oversolving. Like choice
1, this represents an adaptive strategy. In this case, the ratio of recent
residuals serves as a measure of the convergence rate. The latter is
expected to decrease close to the solution, leading to smaller values
of $\eta_n$. For the algorithmic parameters, we chose $\gamma = 1$ and
$\alpha = (1 + \sqrt{5})/2$, resulting in a convergence behavior similar to
choice 1. The initial value and upper bound for the forcing term are set to
$\eta_0 = \eta_{\max} = 0.75$.

3. Choice 3 is given by $\eta_n = 0.1$, i.e., the forcing term is set to a
constant value, corresponding to a modest accuracy for solving the linear
system. Kelley (2018) suggests this choice as a simple forcing-term
strategy which works well in practice.

4. Choice 4 sets the forcing term to a low constant value of
$\eta_n = 5 \times 10^{-5}$, corresponding to a high accuracy. The accuracy is
chosen so that the Newton-CG scheme converges in one step for the linear
elastic case.
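
The four strategies listed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used for the computations: all function names are hypothetical, the arguments are scalar residual norms, and the safeguards are written in the standard forms known from Eisenstat and Walker (1996) and Kelley, which are assumed here to coincide with (3.21), (3.22) and (3.24).

```python
import math

GOLDEN_RATIO = (1.0 + math.sqrt(5.0)) / 2.0
ETA_MAX = 0.75    # initial value and upper bound for choices 1 and 2
CHOICE_3 = 0.1    # constant forcing term, modest accuracy
CHOICE_4 = 5e-5   # constant forcing term, high accuracy


def choice_1(res_new, lin_res_prev, res_prev, eta_prev):
    """First adaptive strategy.

    res_new      -- norm of the current residual
    lin_res_prev -- norm of the linearized residual of the previous step
    res_prev     -- norm of the previous residual
    eta_prev     -- forcing term of the previous step
    """
    eta = abs(res_new - lin_res_prev) / res_prev
    # Assumed form of safeguard (3.21): do not let eta drop too quickly.
    guard = eta_prev ** GOLDEN_RATIO
    if guard > 0.1:
        eta = max(eta, guard)
    return min(eta, ETA_MAX)


def choice_2(res_new, res_prev, eta_prev, gamma=1.0, alpha=GOLDEN_RATIO):
    """Second adaptive strategy based on the ratio of recent residuals."""
    eta = gamma * (res_new / res_prev) ** alpha
    # Assumed form of safeguard (3.22).
    guard = gamma * eta_prev ** alpha
    if guard > 0.1:
        eta = max(eta, guard)
    return min(eta, ETA_MAX)


def kelley_safeguard(eta, res_new, tol):
    """Assumed form of (3.24): avoid solving far below the tolerance."""
    return min(ETA_MAX, max(eta, 0.5 * tol / res_new))
```

In a Newton-CG loop, `kelley_safeguard` would be applied to the value returned by `choice_1` or `choice_2` before the linear solve; for the constant choices 3 and 4 no safeguard is applied.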

The boundary conditions for the problem correspond to uniaxial extension
up to 1% tensile strain in fiber direction, parallel to the x-axis. The
load is applied in a single step.

Two scenarios are considered. In the first case, the polyamide matrix is
assumed to behave in a purely elastic way, resulting in a linear problem.
For this example, the Newton-CG scheme and the BFGS-CG scheme are
equivalent. In particular, this allows us to investigate the characteristic
convergence behavior of the adaptive forcing-term choices 1 and 2 and
the modest accuracy choice 3. Furthermore, we are interested in how the
computational runtimes of choices 1 to 3 compare to that of choice 4,
which is expected to converge in a single Newton step.

In the second case, the matrix behavior is governed by J2-elastoplasticity, 
constituting a non-linear problem. For the Newton-CG scheme, we 
compare the convergence behavior of the high accuracy choice 4 to 
the other options and evaluate whether quadratic convergence can be 
reached. Furthermore, we discuss how the convergence behavior for 
the different strategies changes when the approximated tangent stiffness 
of the BFGS-CG scheme is used. We conclude the investigation by 
evaluating the computational performance of the forcing term choices for 
both solvers and evaluate whether a strategy of choice can be identified. 


Figure 3.4: Porous glass-fiber reinforced polyamide: Residual vs. number of
Newton iterations for forcing term choices 1 to 4, with the tolerance
marked. (a) Linear elastic matrix behavior: Newton-CG solver;
(b) J2-elastoplastic matrix behavior: Newton-CG solver (left) and BFGS-CG
solver (right)


To evaluate the impact of the different forcing term choices, the residual 
is plotted as a function of the number of Newton iterations in Fig. 3.4, 
and as a function of the computation time in Fig. 3.5. The final iteration 
counts and computation times are listed in Tab. 3.3. 


Figure 3.5: Porous glass-fiber reinforced polyamide: Residual vs.
computation time

First, we take a look at the linear elastic case. As expected, the Newton
scheme converges in a single step for the high accuracy choice 4. Choice
3 requires 5 iterations and converges at a linear rate. For choices 1 and
2, the convergence behavior is similar. Both start with a low accuracy
and a comparatively slow convergence rate. As the residual becomes
smaller, the value of $\eta_n$ decreases and the linear system is solved to
higher accuracy. Consequently, the convergence rate increases for the
last iterations. For the linear elastic case, we observe that the overall
number of iterations, i.e., the sum of CG and Newton iterations, is similar
for all forcing term strategies, see Tab. 3.3. The computational effort of
solving the linear system to high accuracy is comparable to taking a
larger number of Newton steps with modest accuracy. Hence, despite
the differences in Newton iteration counts, the different forcing-term
choices exhibit similar computation times, see Fig. 3.5. Notably, choice
4 is not the fastest, even though it led to convergence in a single step.
The remaining difference in runtimes between the choices is explained
by the wasted computational effort of solving to a smaller residual than
required. Fortuitously, the final residual for choice 3 is the closest to the
chosen tolerance, leading to the lowest computation time.

Next, we consider the non-linear case solved by the Newton-CG scheme. 
For choices 1 to 3, the convergence behavior is similar to the linear elastic 
case. Choice 4, however, requires 5 iterations and does not converge 
much faster than choice 3, even though a much higher accuracy is
used. Note that for the current example, the Newton-CG scheme with
forcing term choice 4 does not exhibit a quadratic convergence rate 
within the chosen tolerance. For a preliminary computation on the small 
microstructure of Sec. 3.4.2, we could confirm a quadratic convergence 
rate for the Newton-CG method using very low tolerances 6 = 1078 
and 7, = 107° and thereby validate our implementation. However, the 
computational effort wasted by oversolving was even more excessive for 
such a setup. With respect to computation time, choices 1 and 2 are the
fastest for the current example, converging after just over 300 seconds. 
Choice 3 takes roughly 30% longer. Taking a look at the overall runtime 
of choice 4 reveals the computational cost of oversolving. For this 
example, the advantage of Kelley’s safeguard (3.24) becomes apparent. 
For all forcing-term strategies, we arrive at a residual slightly above the 
desired accuracy in the second to last iteration. For the adaptive choices 
1 and 2, safeguard (3.24) is active and, consequently, the linear system 
is solved to low accuracy in short time. In case of the constant choices 
3 and 4, where the safeguard is not used, we arrive at residuals much 
smaller than the desired accuracy, wasting computational effort. 

To conclude the investigation, we take a look at the BFGS-CG scheme. 
For this solver, choices 3 and 4 lead to roughly the same linear rate of 
convergence. After a few initial steps with a low accuracy, an identical
convergence rate is approached for choices 1 and 2, as well. Apparently, 
higher accuracy than for choice 3 does not improve the convergence rate 
for the BFGS tangent approximation (3.34). With respect to the overall 
runtime, choices 1 and 2 are fastest, with choice 3 being only marginally 
slower. Choice 4 is the slowest option by far. 

Table 3.3: Porous glass-fiber reinforced polyamide: Iteration counts and
computation times for different forcing term choices

Newton-CG: Linear elastic matrix
                      Choice 1   Choice 2   Choice 3   Choice 4
Comp. time in s          264.0      281.8      219.6      246.3
Newton iter. count           7          8          5          2
CG iter. count             126        132        107        119

Newton-CG: Matrix governed by J2-plasticity
                      Choice 1   Choice 2   Choice 3   Choice 4
Comp. time in s          321.0      306.1      391.2     1524.0
Newton iter. count           8          8          6          5
CG iter. count             154        147        193        770

BFGS-CG: Matrix governed by J2-plasticity
                      Choice 1   Choice 2   Choice 3   Choice 4
Comp. time in s          389.7      373.5      434.3     2109.9
Newton iter. count           9          9          7          7
CG iter. count             179        174        207       1053

To summarize, we observe that for non-linear material behavior, the
forcing term choices 1 and 2 by Eisenstat and Walker lead to the shortest
runtime. However, choice 3 with a constant forcing term of $\eta_n = 0.1$ is
not much slower and serves as an easy-to-implement alternative. Based
on the performance of choice 4, we come to the same conclusion as Knoll
and Keyes (2004): Aiming for a high (possibly quadratic) convergence
rate by solving the linear system to high accuracy is inefficient with
respect to the overall runtime of the scheme. These conclusions hold both
for Newton-CG and BFGS-CG. Comparing the two solution schemes,
we find that for the fastest forcing-term choice 2 the BFGS-CG scheme is
only about 22% slower than the Newton-CG method, even though we
applied a large non-linear load step. For the material laws considered
in this example, we conclude that the BFGS update leads to a decent
approximation of the tangent stiffness in a limited number of iterations.
Discussion of the effective elastoplastic material properties. From a 
material-science viewpoint, the effective elastoplastic behavior of the 
composite material is of interest. In particular, this includes
characterizing the anisotropy of the stress-strain relation in the elastic
regime and the shape of the yield boundary. To this end, we simulate uniaxial
tensile tests in various directions relative to the fiber direction, i.e., the 
x-axis. To be specific, the loading is applied at 0°, 15°, 45° and 90° 
relative to the x-axis in the xz- and xy-plane and at 0°, 45° and 90° 
relative to the y-axis in the yz-plane. The tensile tests are performed 
up to 5% strain in load direction and subdivided into 50 load steps to 
obtain finely resolved stress-strain curves. This gives us the opportunity 
to evaluate the performance of the investigated solvers for a relevant 
practical application. 

This paragraph focuses on the characterization of the material behavior, 
based on the results of the simulations. The convergence behavior and 
runtimes of the solution schemes are subsequently discussed in Sec. 3.4.3. 
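The linear elastic behavior of the composite is characterized by the
effective stiffness tensor $\bar{\mathbb{C}}$ relating effective stress
$\bar{\sigma} = \langle \sigma \rangle_Y$ and effective strain
$\bar{\varepsilon} = \langle \varepsilon \rangle_Y$ by Hooke's law

$$\bar{\sigma} = \bar{\mathbb{C}} : \bar{\varepsilon}. \qquad (3.37)$$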


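Using the elastic parameters in Tab. 3.1, the effective stiffness of the
composite material, given in Voigt's notation, reads

$$\bar{\mathbb{C}} \mathrel{\hat{=}} \begin{pmatrix}
10.1 & 1.42 & 1.41 & 0.01 & 0.0 & 0.01 \\
1.42 & 3.49 & 1.45 & 0.03 & 0.0 & 0.0 \\
1.41 & 1.45 & 3.48 & 0.02 & 0.0 & 0.0 \\
0.01 & 0.03 & 0.02 & 1.04 & 0.0 & 0.0 \\
0.0 & 0.0 & 0.0 & 0.0 & 1.11 & 0.02 \\
0.01 & 0.0 & 0.0 & 0.0 & 0.02 & 1.11
\end{pmatrix} \text{GPa},$$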


up to 3 significant digits, and was identified through 6 linear elastic
computations. $\bar{\mathbb{C}}$ may be well approximated by a transversely
isotropic stiffness tensor with engineering constants $E_L = 9.29$ GPa,
$E_T = 2.81$ GPa, $\nu_{TT} = 0.38$, $\nu_{LT} = 0.29$ and $G_{LT} = 1.11$ GPa,
with a relative error below 1%. As a measure of the elastic anisotropy, we
consider $\bar{\mathbb{C}}^{\mathrm{aniso}}$ defined as

$$\bar{\mathbb{C}}^{\mathrm{aniso}} = \bar{\mathbb{C}} - \bar{\mathbb{C}}^{\mathrm{iso}} \quad \text{with} \quad \bar{\mathbb{C}}^{\mathrm{iso}} = (\bar{\mathbb{C}} :: \mathbb{P}_1) \, \mathbb{P}_1 + \frac{1}{5} (\bar{\mathbb{C}} :: \mathbb{P}_2) \, \mathbb{P}_2, \qquad (3.38)$$

where $\mathbb{P}_1$ and $\mathbb{P}_2$ denote the projectors onto the spherical and
deviatoric $d \times d$ matrices, respectively. The symbol $::$ denotes the
quadruple tensor contraction, i.e., $a = \mathbb{B} :: \mathbb{C}$ is equivalent to
$a = B_{ijkl} C_{ijkl}$ in index notation, using the summation convention. For
the given material,
$\| \bar{\mathbb{C}}^{\mathrm{aniso}} \| / \| \bar{\mathbb{C}} \| = 47\%$ in the Frobenius
norm, i.e., the elastic anisotropy is strong for this case.
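
The anisotropy measure (3.38) can be evaluated directly from the Voigt matrix given above. The following is a minimal sketch in plain Python; the function names are hypothetical, and it assumes that the 6x6 matrix stores the tensor components $C_{ijkl}$ without additional factors, so that shear rows and columns are scaled by $\sqrt{2}$ (normalized Voigt or Mandel notation) before norms and contractions are computed.

```python
import math

# Effective stiffness in Voigt notation (GPa), copied from the text.
C_VOIGT = [
    [10.10, 1.42, 1.41, 0.01, 0.00, 0.01],
    [1.42, 3.49, 1.45, 0.03, 0.00, 0.00],
    [1.41, 1.45, 3.48, 0.02, 0.00, 0.00],
    [0.01, 0.03, 0.02, 1.04, 0.00, 0.00],
    [0.00, 0.00, 0.00, 0.00, 1.11, 0.02],
    [0.01, 0.00, 0.00, 0.00, 0.02, 1.11],
]


def to_mandel(c_voigt):
    """Scale shear rows and columns by sqrt(2) (assumed convention)."""
    w = [1.0, 1.0, 1.0] + [math.sqrt(2.0)] * 3
    return [[w[i] * w[j] * c_voigt[i][j] for j in range(6)] for i in range(6)]


def contract(a, b):
    """Quadruple contraction of two fourth-order tensors in Mandel form."""
    return sum(a[i][j] * b[i][j] for i in range(6) for j in range(6))


def anisotropy_ratio(c_voigt):
    """Frobenius norm of the anisotropic part relative to the full tensor."""
    c = to_mandel(c_voigt)
    # Projectors onto spherical (p1) and deviatoric (p2) symmetric matrices.
    p1 = [[1.0 / 3.0 if i < 3 and j < 3 else 0.0 for j in range(6)]
          for i in range(6)]
    p2 = [[(1.0 if i == j else 0.0) - p1[i][j] for j in range(6)]
          for i in range(6)]
    a = contract(c, p1)
    b = contract(c, p2) / 5.0
    c_aniso = [[c[i][j] - a * p1[i][j] - b * p2[i][j] for j in range(6)]
               for i in range(6)]
    return math.sqrt(contract(c_aniso, c_aniso) / contract(c, c))


ratio = anisotropy_ratio(C_VOIGT)
```

For the matrix above, `ratio` evaluates to roughly 0.47, in line with the value quoted in the text. For an isotropic stiffness, the same function returns zero up to round-off.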


Figure 3.6: Elastoplastic behavior of the porous glass-fiber reinforced
polyamide. (a) Stress-strain curves for varying load angles in the xz-plane;
(b) offset yield strength $R_{p0.2\%}$ at varying load angles in the xz-, xy-
and yz-plane. The load angles are measured relative to the x-axis (fiber
direction) in the xz- and xy-plane and relative to the y-axis in the yz-plane


The stress-strain curves for the simulated uniaxial tensile tests in the
xz-plane are shown in Fig. 3.6a. We observe that, up to an angle of 45°,
the stiffness decreases and the onset of plastic behavior shifts to lower
stresses and higher strains. Between 45° and 90° offset of fiber to load
direction, the observed behavior stays roughly identical. A common
measure to quantify the onset of plasticity is the offset yield point
$R_{p0.2\%}$, as the actual yield stress is difficult to determine for smooth
stress-strain diagrams. The offset yield point $R_{p0.2\%}$ is defined as the
stress where the component of the effective plastic strain
$\bar{\varepsilon}_p = \bar{\varepsilon} - \bar{\mathbb{C}}^{-1} : \bar{\sigma}$ in load
direction reaches 0.2%. The results with respect to the load angle are shown
in Fig. 3.6b. Due to the isotropic behavior in the yz-plane perpendicular to
the fiber direction, as well as the similarity of the curves in the xz- and
xy-plane, the boundary of the effective yield surface is approximately
transversely isotropic. The yield strength in fiber direction is highest and
decreases in a roughly linear way up to a relative angle of 45°. Between
45° and 90°, it stays approximately constant. Even though the yield
strength perpendicular to the fiber direction is a factor of 2.5 lower than
in fiber direction, it is still 1.6 times higher than for the unreinforced
matrix material, see Tab. 3.1.
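
The definition of the offset yield point translates into a short post-processing step. The sketch below is a scalar simplification under stated assumptions: it works on sampled strain and stress components in load direction and uses a single effective modulus `modulus` in that direction instead of the full tensor expression; the function name and arguments are hypothetical.

```python
def offset_yield_strength(strain, stress, modulus, offset=0.002):
    """Stress at which the plastic strain reaches the given offset.

    strain, stress -- samples of a uniaxial stress-strain curve
    modulus        -- effective elastic modulus in load direction
    Returns None if the offset is not reached within the data.
    """
    plastic = [e - s / modulus for e, s in zip(strain, stress)]
    for k in range(1, len(plastic)):
        if plastic[k - 1] < offset <= plastic[k]:
            # Linear interpolation between the bracketing load steps.
            t = (offset - plastic[k - 1]) / (plastic[k] - plastic[k - 1])
            return stress[k - 1] + t * (stress[k] - stress[k - 1])
    return None
```

With 50 load steps up to 5% strain, the interpolation error is limited by the strain increment of 0.1% per step, which is one reason to resolve the stress-strain curves finely.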

Performance comparison for uniaxial extension. Due to the trans- 
versely isotropic material behavior, we restrict the performance compari- 
son of the solution schemes to the computations in the xz-plane. Fig. 3.7 
shows the computation time, the total number of iterations and the 
number of gradient evaluations for each load step. For the Newton-CG 
and BFGS-CG solvers, the total number of iterates denotes the sum of CG 
and outer iterations, whereas only the latter are counted for the number 
of gradient evaluations. For the basic scheme and the Barzilai-Borwein 
scheme, the gradient is evaluated in each iteration, leading to identical 
counts for both values. 

Figure 3.7: Porous glass-fiber reinforced polyamide: Performance comparison
of the solution schemes for uniaxial extension at various load angles
relative to the x-direction in the xz-plane

Table 3.4: Porous glass-fiber reinforced polyamide: Mean computation times
and iteration counts for uniaxial extension at various load angles in the
xz-plane

Load angle                               0°      15°      45°      90°
Newton-CG      Mean Newton iter.         4.1      4.3      4.3      4.4
               Mean CG iter.            28.9     31.7     27.6     31.3
               Mean comp. time in s     72.3     76.0     69.3     76.1
BFGS-CG        Mean Newton iter.         4.1      4.2      4.3      4.3
               Mean CG iter.            28.0     30.1     28.0     30.5
               Mean comp. time in s     72.3     78.1     73.2     78.3
BB             Mean iter.               27.2     28.7     27.0     27.4
               Mean comp. time in s     55.6     59.3     56.4     57.4
Basic scheme   Mean iter.              199.8    236.6    201.4    210.9
               Mean comp. time in s    341.1    411.6    347.7    367.8

Qualitatively, the resulting plots for the computations at varying load
angles are roughly similar. As the affine-linear extrapolation takes effect,
the iteration counts and runtimes significantly decrease from the first to
the second load step. For the computations with relative load angles of
45° and 90°, the second load step is still linear elastic and the solution
schemes converge within a single iteration. Subsequently, the iteration
counts increase at the onset of plastification and decrease again after the
material is fully plastified. Taking a closer look at the BFGS-CG method,
we notice that its performance closely matches that of the Newton-CG
method. This observation holds both for the overall performance, see
Tab. 3.4, and for the iteration count and runtime within each load
step, see Fig. 3.7. The tangent stiffness tensor for J2-elastoplasticity is
merely a rank-one update of the elastic stiffness tensor, see Sec. 3.3.2 in
Simo and Hughes (1998). As the BFGS-CG method is initialized with the
elastic stiffness, the analytic tangent is well approximated within a few
BFGS updates.
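
The role of the secant updates can be illustrated on a small dense example. The following is a minimal sketch using the generic textbook BFGS formula on plain Python lists; it is not necessarily identical to the update (3.34), and the helper names are hypothetical. Each update modifies the current approximation by a low-rank term such that the step `s` is mapped to the observed gradient difference `y`.

```python
def matvec(a, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def inner(x, y):
    """Euclidean inner product of two vectors."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def bfgs_update(b, s, y):
    """Textbook BFGS update of a symmetric matrix b.

    The returned matrix satisfies the secant condition  B_new s = y
    and stays symmetric. Requires inner(y, s) > 0.
    """
    bs = matvec(b, s)
    sbs = inner(s, bs)
    ys = inner(y, s)
    n = len(s)
    return [[b[i][j] - bs[i] * bs[j] / sbs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]
```

Starting from the elastic stiffness as initial matrix `b` mirrors the initialization described above; how many updates are needed in practice depends on how far the exact tangent deviates from that initial guess.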

Evaluating the material law of J2-elastoplasticity is comparatively cheap,
see Simo and Hughes (1998). More precisely, the computation time spent
on evaluating $\varepsilon \mapsto \sigma(\varepsilon)$ for all voxels is roughly of the
same order of magnitude as the computation time for the application of
$\Gamma^0$ and the associated FFTs for typical cell sizes and resolutions.
Usually, these are the most expensive steps in an FFT-based solution
algorithm.

Table 3.5: Porous glass-fiber reinforced polyamide: Computation time per
application of the most expensive operations for loading in x-direction and
solved by Newton-CG

                        Mean comp. time per application in ms
Material law                  653.0
Tangent                       315.9
FFT                           893.7
$\Gamma^0$ operator           147.9

In Tab. 3.5, the average computation time per application of these
operations is given for the 0° load case solved by the Newton-CG method.
For the given problem, we observe that evaluating the material law is
slightly faster than applying forward and backward FFT, and about twice as
expensive as applying the tangent
$\xi \mapsto \frac{\partial \sigma}{\partial \varepsilon}(\varepsilon_n) : \xi$, i.e., a linear
elastic material. The results for the other load cases and solution schemes
are roughly similar. Note that the tangent operator is only applied when
using the Newton-CG and BFGS-CG methods. As a consequence, the
computational cost of a gradient evaluation is similar to a CG iteration
and the runtimes of all solvers are roughly proportional to their total
iteration count, see Fig. 3.7. Thus, even though the Newton-CG and
BFGS-CG methods require far fewer evaluations of the material law, the
Barzilai-Borwein scheme converges faster. The basic scheme is slower
than the other investigated algorithms by a factor of 5 to 8. Due to
the affine-linear extrapolation, the difference in performance is not as
pronounced as for our previous example in Sec. 3.4.2.
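
The statement that a gradient evaluation costs about as much as a CG iteration can be checked with the timings of Tab. 3.5. The sketch below is a back-of-the-envelope estimate under an assumed split of operations: one gradient evaluation is taken as material law, FFT pair and $\Gamma^0$ application, one CG iteration as tangent, FFT pair and $\Gamma^0$ application. All other overhead is ignored, so the absolute numbers need not match measured runtimes, and the function name is hypothetical.

```python
def cost_per_step(material, tangent, fft, gamma):
    """Return the assumed cost (ms) of a gradient evaluation and a CG iteration."""
    t_gradient = material + fft + gamma
    t_cg = tangent + fft + gamma
    return t_gradient, t_cg


# Mean times per application in ms from Tab. 3.5 (0 degree load case).
t_gradient, t_cg = cost_per_step(653.0, 315.9, 893.7, 147.9)
ratio = t_gradient / t_cg
```

With these numbers, a gradient evaluation costs about 1.25 times a CG iteration, so counting both kinds of iterations equally is a reasonable first approximation for this material law.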


3.4.4 Directionally solidified NiAl-Cr(Mo) alloy 
Due to their high melting point and corrosion resistance,
nickel-aluminum-chromium eutectics with minor additions of molybdenum, i.e.,
NiAl-Cr(Mo) alloys, are a promising class of structural high-temperature
materials. The material behavior of the components in this alloy is governed
by single-crystal elasto-viscoplasticity. Compared to the material laws of
Sec. 3.4.3, i.e., linear elasticity and J2-elastoplasticity, evaluating the
material law of a single-crystal elasto-viscoplasticity model is
considerably more expensive and tends to dominate the overall computation
time (Eghtesad et al., 2018a). Thus, NiAl-Cr(Mo) alloys represent a
valuable benchmark for the investigated solution schemes. It is expected
that the number of required gradient evaluations is more indicative
of the overall performance in this case. This fact favors the use of
(Quasi-)Newton-Krylov methods, as the solution of the linear system is
less relevant for the runtime.

After a directional solidification process, NiAl-Cr(Mo) develops a cellular 
structure with NiAl and Cr(Mo) lamellae parallel to the growth direction 
(Cline and Walter, 1970). Similar microstructures are observed for other 
intermetallics, e.g. titanium-aluminides (Huang and Hall, 1991) or 
iron-aluminides (Scherf et al., 2016; Schmitt et al., 2017). To investigate 
the mechanical behavior of a lamellar NiAl-Cr(Mo) alloy, a cellular
microstructure with 512 grains was generated using the Voronoi tessellation
routine of the software Neper (Quey et al., 2011). Based on findings
by Whittenberger et al. (2001) and Raj and Locci (2001) for moderate
solidification rates, an aspect ratio of 4 along the growth direction
parallel to the y-axis was chosen for the grains. The microstructure
is shown in Fig. 3.8, resolved by $64^3$ voxels.

Notice that we do not resolve the lamellar structure for each grain as this
would require an excessively high voxel count. Instead, we homogenize
a two-phase laminate for each voxel using the algorithm presented in
Kabel et al. (2017). The orientation of the grains was chosen so that the
normal direction of the laminate interface is uniformly distributed in the
xz-plane, i.e., perpendicular to the growth direction. Cline and Walter
(1970) investigated the crystallographic relationship in the laminate and
showed that all planes and directions of NiAl and Cr(Mo) are parallel.
The laminate interface is parallel to the (112) plane and the growth
direction is parallel to the $\langle 111 \rangle$ direction.

Figure 3.8: Directionally solidified NiAl-Cr(Mo)

For the two phases of the laminate, the material behavior is governed
by a single-crystal elasto-viscoplastic model. The infinitesimal strain is
additively decomposed

$$\varepsilon = \varepsilon_e + \varepsilon_p \qquad (3.39)$$

into elastic $\varepsilon_e$ and plastic $\varepsilon_p$ parts. The stress-strain
relationship follows Hooke's law

$$\sigma = \mathbb{C} : \varepsilon_e = \mathbb{C} : (\varepsilon - \varepsilon_p) \qquad (3.40)$$

for the elastic strains. For single-crystal elasto-viscoplasticity, the plastic
strain is composed of simple shear deformations of the individual
crystallographic slip systems. The evolution of the plastic strain is
governed by (Bishop, 1953)

$$\dot{\varepsilon}_p = \sum_{\alpha=1}^{N} \dot{\gamma}_\alpha \, d_\alpha \otimes_s n_\alpha, \qquad (3.41)$$

where $\dot{\gamma}_\alpha$, $d_\alpha$ and $n_\alpha$ denote the slip rate, slip direction
and slip plane normal for the $\alpha$-th of $N$ slip systems, respectively. For
the flow rule of the slip rate, we chose the power-law formulation of
Hutchinson (1976)

$$\dot{\gamma}_\alpha = \dot{\gamma}_0 \, \mathrm{sgn}(\tau_\alpha) \left| \frac{\tau_\alpha}{\tau^F} \right|^m \quad \text{with} \quad \tau_\alpha = \sigma : (d_\alpha \otimes_s n_\alpha) \qquad (3.42)$$

and reference slip rate $\dot{\gamma}_0$, yield stress $\tau^F$ and stress exponent
$m$. For


the reinforcing Cr(Mo) phase, the yield stress $\tau^F$ is modeled following
Albiez et al. (2016a)


2
F Too : 1 po
= th = 1- -<k 1-,/—
ar VP a| sr zar) ( Ye)!


(3.43)


with maximum yield stress $\tau_\infty$, characteristic length $d$, recovery
constant $k_a$, dislocation density $\rho$ with its initial value $\rho_0$ and its
saturation value $\rho_s$. NiAl is assumed to behave perfectly plastic, i.e.,
$\tau^F = \tau^F_0$. The material parameters and volume fractions for
NiAl-31Cr-3Mo are taken from Albiez et al. (2016b), see Tab. 3.6.
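
The flow rule (3.41) and (3.42) for a fixed yield stress can be written down compactly. The following is a minimal sketch in plain Python with hypothetical function names; it covers the perfectly plastic case with constant `tau_f` only and does not include the evolution of the yield stress of the Cr(Mo) phase. Stresses are 3x3 nested lists, and each slip system is a pair of unit vectors `(d, n)`.

```python
import math


def resolved_shear_stress(sigma, d, n):
    """tau = sigma : (d (x) n) for a symmetric 3x3 stress sigma."""
    return sum(sigma[i][j] * d[i] * n[j] for i in range(3) for j in range(3))


def slip_rate(tau, tau_f, gamma0, m):
    """Power-law flow rule: gamma0 * sgn(tau) * |tau / tau_f|**m."""
    return gamma0 * math.copysign(1.0, tau) * abs(tau / tau_f) ** m


def plastic_strain_rate(sigma, systems, tau_f, gamma0, m):
    """Sum of symmetrized simple shears over all slip systems (d, n)."""
    rate = [[0.0] * 3 for _ in range(3)]
    for d, n in systems:
        g = slip_rate(resolved_shear_stress(sigma, d, n), tau_f, gamma0, m)
        for i in range(3):
            for j in range(3):
                rate[i][j] += 0.5 * g * (d[i] * n[j] + d[j] * n[i])
    return rate
```

Because the slip rate grows with the power `m` of the stress ratio, doubling the resolved shear stress increases the slip rate by a factor of `2**m`, which explains the stiffness of the resulting problem and the cost of evaluating the material law.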

Note that the single-crystal plasticity model with Hutchinson's flow rule
is not a generalized standard material (Steinmann and Stein, 1996) and
has a non-symmetric tangent stiffness. As the tangent stiffness of the
phases enters the homogenized tangent stiffness of the laminate, see
Glüge and Kalisch (2014), this would usually prohibit using the CG
method for solving (3.28). However, we found in Sec. 4.6 that using the
Newton-CG method and considering only the symmetric part of the
tangent stiffness yielded decent results. Hence, we use the symmetrized
tangent stiffness of the single phases for the solution of the laminate and
the computation of its tangent.

Table 3.6: Directionally solidified NiAl-Cr(Mo): Material parameters of
Cr(Mo) lamellae and NiAl matrix (Albiez et al., 2016b)

                   Cr(Mo)                              NiAl
Volume fraction    $c_{\mathrm{Cr(Mo)}} = 0.46$            $c_{\mathrm{NiAl}} = 0.54$
Elastic moduli     $C_{11} = 350.0$ GPa                $C_{11} = 182$ GPa
                   $C_{12} = 67.8$ GPa                 $C_{12} = 120$ GPa
                   $C_{44} = 100.8$ GPa                $C_{44} = 85.4$ GPa
Flow rule          $\dot{\gamma}_0 = 0.4$ s$^{-1}$          $\dot{\gamma}_0 = 10^{-3}$ s$^{-1}$
                   $m = 4.6$                           $m = 5.75$
Hardening          $\tau_\infty = 3256.7$ MPa          $\tau^F_0 = 37.25$ MPa
                   $d = 0.409$ µm
                   po = 10° mm?
                   ps = 2.9 x 107” mm?
                   $k_a = 13$
Slip systems       $\{110\}\langle 111 \rangle$        $\{001\}\langle 100 \rangle$
                   $\{112\}\langle 111 \rangle$        $\{011\}\langle 100 \rangle$
                   $\{123\}\langle 111 \rangle$        $\{011\}\langle 110 \rangle$

Discussion of the effective creep behavior. For high-temperature
structural materials, the creep behavior, i.e., the deformation of the
material subjected to a constant stress load, is an important mechanical
characteristic. To investigate the anisotropic creep behavior of the
NiAl-Cr(Mo) microstructure, we simulate creep tests in various directions
relative to the growth direction of the material, i.e., the y-axis. More
specifically, we apply boundary conditions corresponding to uniaxial
compression with a magnitude of 200 MPa at 0°, 15°, 45° and 90° relative to
the y-axis in the yz- and xy-plane and at 0°, 45° and 90° relative to the
x-axis in the xz-plane. The load is applied within 1 second in a single load
step and is afterwards held constant for a specified creep time, which is
subdivided into 50 load steps. The creep times for each angle are listed in
Tab. 3.7 and were chosen to obtain a fine resolution of the creep rate in
time. Note that, due to the prescribed softening behavior (3.43), an
excessively coarse resolution of the load steps over time leads to
divergence of the solution schemes for this material. Simulating such a
creep loading is a challenging problem for the investigated solution
schemes, as a load transfer from the softer NiAl to the more creep-resistant
Cr(Mo) occurs as a viscous effect after the initial loading, see Albiez et
al. (2016a;b). Thus, the loading in the single phases is non-monotone,
especially in the first few load steps after the initial loading.


             0°         15°        45°      90°
yz-plane     10000 s    2000 s     100 s    100 s
xy-plane     10000 s    2000 s     100 s    100 s
xz-plane     100 s      -          100 s    100 s

Table 3.7: Directionally solidified NiAl-Cr(Mo): Creep times with respect to
load angle for all simulated creep experiments


In the following, we discuss the creep behavior observed in the
simulations. The performance of the solution schemes for this example is
compared in Sec. 3.4.4. For the characterization of the creep behavior,
the creep rate $\dot{\varepsilon}^c$, i.e., the rate of the strain component in load
direction measured after the initial loading, and its minimum value
$\dot{\varepsilon}^c_{\min}$ are of interest.
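
Both quantities can be extracted from the sampled creep curve by finite differences. The sketch below is a minimal post-processing example with hypothetical function names; it assumes that `time` and `strain` hold the values at the end of each load step after the initial loading, and it uses the magnitude of the rate, since the strain in load direction is negative under compression.

```python
def creep_rates(time, strain):
    """Finite-difference creep rates between consecutive load steps."""
    return [(strain[k] - strain[k - 1]) / (time[k] - time[k - 1])
            for k in range(1, len(time))]


def minimum_creep_rate(time, strain):
    """Smallest magnitude of the creep rate over the recorded history."""
    return min(abs(r) for r in creep_rates(time, strain))
```

The resolution of the minimum depends on the time step, which is consistent with choosing the creep times per load angle such that the creep rate is finely resolved.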

Figure 3.9: Effective creep behavior of directionally solidified NiAl-Cr(Mo)
at different load angles for an applied load of 200 MPa. The load angles are
given with respect to the y-axis (growth direction) in the yz- and xy-plane
and with respect to the x-axis in the xz-plane

In Fig. 3.9a, the creep curves for the simulations in the yz-plane are
shown. The curve for the load in growth direction agrees well with the
computational and experimental results reported by Albiez et al. (2016b).
Up to a load angle of 45°, we observe an increase in the overall creep
rate and a less pronounced softening behavior, i.e., an increase of the
creep rate at increasing strains. This signifies that, in case of aligned
load and growth direction, a large amount of stress is carried by the
creep-resistant Cr(Mo) lamellae, which in turn activates their softening
behavior. Fig. 3.9b shows the minimum creep rate for all computations
as a function of the load angle. The good agreement of the results in
the yz- and xy-plane as well as the approximately isotropic behavior in
the xz-plane indicate a transversely isotropic effective creep behavior
for NiAl-Cr(Mo). We observe that with increasing angle relative to the
growth direction the logarithm of the minimum creep rate increases
linearly up to an angle of 45° and subsequently stagnates. The difference
between the highest and lowest value for $\dot{\varepsilon}^c_{\min}$ is slightly
over two orders of magnitude. This represents an improvement in robustness
compared to the similar directionally solidified molybdenum-reinforced
nickel-aluminum alloys (NiAl-Mo), which form unidirectionally aligned
fiber structures instead of laminates. For NiAl-Mo, FFT-based
computations predicted a decrease in creep strength by roughly 4 orders of
magnitude down to the level of pure NiAl in case of off-axis loading, see
Sec. 4.6.3. Similarly, Seemüller et al. (2013) experimentally observed a
considerable increase in creep rate for NiAl-Mo with a high content of
misaligned fibers. Thus, we conclude that the cellular laminate structure
of NiAl-Cr(Mo) leads to a weaker anisotropy and a larger robustness
against misaligned loading compared to fibrous materials with a similar
composition.

Performance comparison for creep loading. In analogy to Sec. 3.4.3, 
we take a closer look at the runtimes, total iteration counts and gradient 
evaluations of the solvers for each load step, see Fig. 3.10. Due to the
material’s transversely isotropic behavior, we restrict the discussion to 
the computations in the yz-plane. During the first few load steps of the 
creep computations, we observe high iteration counts and runtimes, due 
to the initial load application and the subsequent load transfer. This 
behavior is less pronounced for the case where growth direction and 
loading direction are parallel. As the normal directions of the laminates
are distributed in the xz-plane, all laminate planes are parallel to the
y-direction. Thus, the resultant fields are less heterogeneous for a loading 
in this direction, leading to lower computational costs. As the fields 
stabilize and the affine-linear extrapolation takes effect, computation 
times and the required number of material evaluations decrease to a 
lower level, roughly between load step 5 and 15. For the 0° load angle 
and 15° load angle computations, the computation time per material 
evaluation increases with the creep time, due to the softening of the 
material. In the former case, the required number of iterations increases 
as well, as the softening is more pronounced and leads to a higher 
internal material contrast. 

In contrast to our previous example in Sec. 3.4.3, solving the two-phase
laminate and evaluating the single-crystal elasto-viscoplastic material
laws dominates the overall computation time, see Tab. 3.8. This holds
true for all solvers and load cases. Hence, we observe that the runtime is
approximately proportional to the number of gradient evaluations.

Figure 3.10: Directionally solidified NiAl-Cr(Mo): Performance comparison of
the solution schemes for creep loading at various load angles relative to
the y-direction in the yz-plane

Table 3.8: Directionally solidified NiAl-Cr(Mo): Computation time per
application of the most expensive operations for the case of loading in
y-direction solved by the Newton-CG method

                        Mean comp. time per application in ms
Material law                 9944.0
Tangent                         5.0
FFT                             6.7
$\Gamma^0$ operator             2.4

We take a closer look at the convergence behavior of the BFGS-CG 
method. Roughly up to the 5th load step, the BFGS-CG method requires 
a higher number of Newton iterations than the Newton-CG method. In 
comparison to the example in Sec. 3.4.3, it takes more BFGS update 
iterations to achieve a good approximation of the tangent stiffness. 
Firstly, this can be traced back to the difference in loading. Whereas 
the first load steps of the uniaxial extension in Sec. 3.4.3 were in the 
linear elastic regime, the creep loading is rapidly applied in the first load 
step, immediately leading to non-linear material behavior. Secondly, 
the tangent stiffness for the single-crystalline phases and the resulting 
homogenized tangent stiffness of the laminate is more complex than 
the one of J2-elastoplasticity. Thus, with the linear elastic stiffness as 
starting point, more BFGS updates are necessary to approximate the 
material’s tangent stiffness. After the slower initial load steps, BFGS-CG 
and Newton-CG exhibit similar runtimes and Newton iteration counts. 
In fact, the BFGS-CG method even converges in slightly fewer Newton 
iterations than the Newton-CG method for some load steps. This may 
be due to a combination of two factors. Firstly, we use the symmetrized
tangent of the single phases to compute the tangent of the laminate.
Secondly, by using forcing term choice 2, see Sec. 3.4.3, we do not
achieve the highest possible convergence rate for Newton-CG. We
further note that the BFGS-CG method requires more CG iterations
than Newton-CG, see Tab. 3.9. This indicates that the BFGS tangent
approximation exhibits a higher internal material contrast than the
analytic tangent for this example. Comparing the mean computation
times per load step, we see that this does not negatively impact the
method's overall performance. In conclusion, BFGS-CG and Newton-CG
exhibit very similar computation times with BFGS-CG being even
slightly faster for the 15° to 90° load angle computations.

Table 3.9: Directionally solidified NiAl-Cr(Mo): Mean computation times and iteration
counts for creep loading at various angles in the yz-plane

Solver        Quantity                Values for the four load angles
Newton-CG     Mean Newton iter.         4.8     5.4     5.1     4.8
              Mean CG iter.             6.8    12.3    10.9    11.2
              Mean comp. time in s     68.4   114.2    78.5    73.8
BFGS-CG       Mean Newton iter.         4.9     4.9     4.9     4.3
              Mean CG iter.            35.1    38.2    33.2    32.8
              Mean comp. time in s     68.9   107.2    77.6    69.4
BB            Mean iter.                9.1    15.7    14.9    14.0
              Mean comp. time in s    100.6   258.7   166.4   161.2
Basic scheme  Mean iter.               22.7    55.5    46.0    57.8
              Mean comp. time in s    259.6   939.8   611.7   793.2

For the Barzilai-Borwein method, we note that the total number of 
iterations is similar to the Newton-CG method for all computations. 
However, as the material law is evaluated for every iteration of the 
Barzilai-Borwein scheme, the resulting computation times are 1.5 to 2.5 
times higher than for Newton-CG and BFGS-CG. 

The basic scheme is the most time-consuming algorithm, taking about
4 to 10 times longer to converge than the inexact (Quasi-)Newton methods.
Note that for all load cases except the 0° loading, the iteration counts
of the basic scheme fluctuate significantly between load steps, even
after the strain field stabilizes and the creep rate reaches its minimum
value. This unexpected effect is a result of our choice of reference
material $\alpha_0 = (\alpha_+ + \alpha_-)/2$, which is only theoretically
justified for materials whose tangent has a lower bound. For our given
material, this cannot be assured globally, due to the prescribed softening
behavior. However, convergence of the basic scheme to a critical point can
be shown for materials with only an upper bound on the tangent if the
reference material is chosen as $\alpha_0 = \alpha_+$, see Sec. 1.2.3 in
Nesterov's book (Nesterov, 2004). We compared the two choices for
$\alpha_0$ for the 15° load case where the fluctuations were most
pronounced, see Fig. 3.11. For the conservative choice
$\alpha_0 = \alpha_+$, iteration counts and runtimes develop smoothly.
However, the mean iteration count and computation time per load step are
about 30% higher for this choice. Hence, the results for
$\alpha_0 = (\alpha_+ + \alpha_-)/2$ were included in the performance
comparison of the different solution schemes.

Figure 3.11: Directionally solidified NiAl-Cr(Mo): Performance comparison of the two
reference-material choices for the basic scheme for the 15° load case
3.5 Conclusions


Quasi-Newton methods, such as Anderson acceleration (Shantraj et al., 
2015; Chen et al., 2019b;a) and the Barzilai-Borwein method (Schneider, 
2019a), have attracted considerable attention for FFT-based
micromechanics. In contrast to the classical Newton method, these schemes do
not require computing the Hessian. In addition, they generally outper- 
form gradient-descent methods which share this property (Nocedal and 
Wright, 1999). In the present chapter, this motivated us to exploit the 
most popular Quasi-Newton algorithm, the BFGS method, in the context 
of FFT-based micromechanics. First, we proposed an implementation 
of Nocedal’s L-BFGS algorithm (Nocedal, 1980). While this scheme 
proved to be faster than the similar Anderson acceleration, pioneered by 
Shantraj et al. (2015), L-BFGS performed worse than the Barzilai-Borwein 
method which is non-monotonic but has a smaller memory-footprint. 
This can be traced back to the comparatively high computational cost 
per iteration of L-BFGS, due to the many inner product evaluations in 
the classical two-loop algorithm, see Alg. 3. It may be possible to reduce 
this computational overhead, using the more sophisticated L-BFGS 
implementation proposed by Chen et al. (2014), where the computation 
of all inner products can be parallelized more effectively. However, for 
material laws which can be cheaply evaluated, the Barzilai-Borwein 
scheme currently represents the general-purpose method of choice.
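
To illustrate where the inner products mentioned above arise, the following is a minimal sketch of the classical two-loop recursion for a limited-memory BFGS search direction. It is not the implementation of Alg. 3; the function name `lbfgs_direction` and the use of flat NumPy vectors instead of strain fields are assumptions made for illustration. With `m` stored pairs, this sketch evaluates on the order of `3m + 2` inner products per call, each of which would run over a full field in an FFT-based solver.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Apply the L-BFGS inverse Hessian approximation to `grad`.

    s_list, y_list: stored differences of iterates and gradients,
    ordered from oldest to newest (hypothetical storage layout).
    """
    q = np.array(grad, dtype=float)
    # One inner product per stored pair for the curvature scalars.
    rhos = [1.0 / np.vdot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest pair, one inner product each.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * np.vdot(s, q)
        q -= a * y
        alphas.append(a)
    # Initial scaling with the most recent pair (two inner products).
    if s_list:
        q *= np.vdot(s_list[-1], y_list[-1]) / np.vdot(y_list[-1], y_list[-1])
    # Second loop: oldest pair to newest pair, one inner product each.
    for s, y, rho, a in zip(s_list, y_list, rhos, reversed(alphas)):
        b = rho * np.vdot(y, q)
        q += (a - b) * s
    return q
```

The Barzilai-Borwein scheme, by comparison, only needs the few inner products of its step-size formula per iteration, which is consistent with the lower cost per iteration noted above.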

For computationally expensive material laws, such as single-crystal 
plasticity, it has been shown that Newton-CG is more efficient, due 
to the lower number of gradient (and thus material law) evaluations, 
see Sec. 4.6. This led us to our second use of the BFGS update for 
approximating the material tangent-stiffness in the Newton-CG scheme. 
With the resulting BFGS-CG method, we arrived at a scheme which 
was competitive in performance to the classical Newton-CG method, 
in particular for multistep loads. Although it cannot be measured
in performance benchmarks, time spent programming is as much of a
resource as time spent on computations. Thus, the main advantage of the
BFGS-CG scheme is that it enables the tangent-free implementation of 
complex and computationally demanding material laws while still being 
fast enough to permit their efficient computational homogenization. The 
results of the performance comparison between the investigated solution 
schemes are summarized in Tab. 3.10. 

As a side product of our investigation of (Quasi-)Newton methods, we 
found a globalization strategy suitable for FFT-based micromechanics 
in the line search algorithm of Dong (2010). Another aspect of major 
importance for the overall performance of these schemes was the choice 
of the forcing term. Among the various strategies tested in our numerical 
experiments, consistently solving the linear system to a high accuracy 
was by far the slowest option. Whereas this increased the overall 
computation time by factors of 5 to 7 compared to the other choices, the 
resulting convergence rate with respect to the required Newton iterations 
was barely improved within the given tolerance. The best overall 
performance was achieved by forcing term choice 2 of Eisenstat-Walker 
and its associated safeguards (Eisenstat and Walker, 1996; Kelley, 1995). 
However, similar performance was observed for a constant moderate 
forcing term of 0.1. Thus, the choice between these two options can be 
seen as a matter of preference, i.e., choosing optimal performance versus 
ease of implementation. 
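
As an illustration of the forcing-term strategy discussed above, the following sketch implements choice 2 of Eisenstat and Walker (1996) together with their safeguard. The function name, the default parameters `gamma=0.9`, `alpha=2.0` and the cap `eta_max=0.9` are assumptions for this example and are not necessarily the values used in the numerical experiments.

```python
def forcing_term_choice2(res_norm, prev_res_norm, prev_eta,
                         gamma=0.9, alpha=2.0, eta_max=0.9):
    """Relative tolerance for the linear solver in the next Newton step.

    res_norm, prev_res_norm: norms of the current and previous nonlinear
    residual; prev_eta: forcing term used in the previous Newton step.
    """
    # Choice 2: tie the linear tolerance to the observed residual reduction.
    eta = gamma * (res_norm / prev_res_norm) ** alpha
    # Safeguard: avoid a sudden drop of the forcing term far from the solution.
    safeguard = gamma * prev_eta ** alpha
    if safeguard > 0.1:
        eta = max(eta, safeguard)
    # Keep the forcing term strictly below one.
    return min(eta, eta_max)
```

A constant forcing term, as mentioned above, would simply replace this function by the fixed value 0.1.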

As demonstrated in our numerical experiments, both Newton-CG and 
BFGS-CG can handle non-linear materials with infinite contrast. Conse- 
quently, they are among the most widely applicable algorithms currently 
available in the FFT-based context. However, the robust handling of 
materials with negative tangent eigenvalues, e.g., in case of damage or 
strain-softening, is an open topic for further research. Dai demonstrated 
that the BFGS method does not converge for general functions in four 
or higher dimensions (Dai, 2013). Damped versions of the BFGS update
formula are available, see Procedure 18.2 in Nocedal and Wright (1999),
which stabilize the convergence behavior of the linear solver. Still,
this may result in overall divergence if the disagreement between the
tangent and its approximation becomes too large. It remains to be
investigated whether a suitable approach such as the arc-length method, as
used for conventional finite-element computations (Wriggers, 2008), can
be adapted for FFT-based micromechanics.

Table 3.10: Summary of the performance comparison between the investigated solvers

Solver          Memory in        Summary and remarks
                strain-fields

Basic scheme    1                • Gradient descent method
                                 • Lowest memory requirement
                                 • Slowest among the studied solvers

Anderson acc.   2m + 2           • Limited-memory Quasi-Newton method
                                 • Optimal depth m between 2 and 5
                                 • Accelerates the basic scheme but slower
                                   than the remaining algorithms

L-BFGS          2m + 4           • Limited-memory Quasi-Newton method
                                 • Optimal depth m between 2 and 5
                                 • Outperformed by the more memory-efficient
                                   BB method

BB              2                • Gradient descent with Quasi-Newton based
                                   step size
                                 • Non-monotonic convergence
                                 • Fastest choice for inexpensive material laws

Newton-CG       8.5              • Inexact Newton method
                                 • Highest efficiency in combination with
                                   forcing term 2 by Eisenstat and Walker (1996)
                                 • Requires computing the material tangent
                                 • Fastest choice for expensive material laws

BFGS-CG         10.5             • Inexact Quasi-Newton method
                                 • Uses the BFGS update to approximate the
                                   material tangent
                                 • Matches performance of Newton-CG for small
                                   load steps, slightly slower otherwise
Chapter 4


An efficient solution scheme for small-strain crystal
elasto-viscoplasticity in a dual framework¹


4.1 Introduction 


For polycrystals, evaluating the viscoplastic constitutive material law 
of single-crystalline phases is computationally expensive. The required 
iteration count of the original basic scheme is proportional to the material 
contrast, i.e. the quotient of largest and smallest eigenvalue of the 
tangential stiffness, evaluated for the entire microstructure. Even for a 
polycrystal consisting of a single crystalline phase, the internal mate- 
rial contrast can become large as a result of plastification. Lebensohn 
et al. (2012) adapted the augmented Lagrangian scheme, introduced by 
Michel et al. (2001), to small-strain crystal-elasto-viscoplasticity. The 
algorithm belongs to a class of polarization-based schemes (Moulinec 
and Silva, 2014; Schneider et al., 2019) whose required iteration count is 
proportional to the square root of the material contrast. Another class
of fast solution methods with similar convergence rate was developed
based on the interpretation of the basic scheme as a gradient descent
method (Kabel et al., 2014), enabling the use of accelerated gradient
schemes (Schneider, 2017a; 2019a). Gélébart and Mondon-Cancel (2013)
and Kabel et al. (2014) applied the Newton-Raphson method to the
FFT-context and used Krylov-subspace methods (Zeman et al., 2010;
Brisard and Dormieux, 2010) for solving the corresponding linear system.
As the Newton-Raphson method converges quadratically in the vicinity
of the solution and Krylov-subspace solvers such as conjugate gradients
are optimal for their respective problem class, these algorithms exhibit
excellent performance, see, e.g., Kochmann et al. (2018), albeit at the
cost of high memory requirements. In the case of crystal plasticity, the
low number of required Newton iterations is especially beneficial, as the
evaluation of the material law is much more costly than solving the linear
system. Other approaches for decreasing the overall computational effort
include the use of semi-explicit time integration schemes (Nagra et al.,
2017), spectral databases (Eghtesad et al., 2018b) and large-scale MPI
parallelization (Eghtesad et al., 2018a).

¹ This chapter is based on Wicht et al. (2020a). For the sake of a coherent structure,
formatting and typography of this thesis, minor changes have been made. To avoid
redundancies in the text, the introduction has been shortened.

Except for the polarization-based schemes, all listed methods are formu- 
lated in the conventional strain-based setting which we revisit in Sec. 4.2. 
This study is based on the observation that for certain formulations 
of small-strain single crystal elasto-viscoplasticity, the evaluation of 
the inverse material law, i.e., computing the strain as a function of the 
stress, is much cheaper than the conventional approach, see Sec. 4.3. 
Bhattacharya and Suquet (2005) formulated a dual variational setting, 
see Sec. 4.4, for the unit cell problem and used the basic scheme as solver. 
In this chapter, we exploit the cheap evaluation of the inverse law in the 
dual stress-based setting using modern solution schemes, see Sec. 4.5. 
We compare the performance and convergence behavior of the solvers 
in both settings for a polycrystal and a fibrous NiAl-Mo microstructure 
in Sec. 4.6. 
4.2 Computational homogenization


4.2.1 The cell problem of periodic homogenization 


In this section, we review the cell problem of computational
homogenization for geometrically linear continuum mechanics with simple
materials, see Ch. 2 and 4 in Bertram (2011). Let $Y$ be a rectangular
cell in $\mathbb{R}^d$ and let $L^2(Y;\mathrm{Sym}(d))$ denote the space of
$Y$-periodic and square integrable stress and strain fields, where
$\mathrm{Sym}(d)$ denotes the set of symmetric $d \times d$ matrices. Let
$\varepsilon \in L^2(Y;\mathrm{Sym}(d))$ be the infinitesimal strain field
and denote by $\sigma \in L^2(Y;\mathrm{Sym}(d))$ the stress field. As we
consider both strain- and stress-based formulations in this chapter, we
wish to clearly distinguish between the stress field and the stress
operator, i.e., the material law. Hence, we denote the heterogeneous and
possibly non-linear but point-wise invertible material law by
$F : Y \times \mathrm{Sym}(d) \to \mathrm{Sym}(d)$, so that
$\sigma = F(\varepsilon)$. The material law may result, e.g., from the
implicit time discretization and static condensation of a generalized
standard material. In computational homogenization, we seek a solution
to the set of equations

\[
  \varepsilon = \langle \varepsilon \rangle_Y + \nabla^s u
  \quad \text{and} \quad \operatorname{div} \sigma = 0, \tag{4.1}
\]

where $\nabla^s$ denotes the symmetrized gradient operator and
$u : Y \to \mathbb{R}^d$ is a periodic and mean-free displacement
fluctuation field. To prescribe (possibly mixed) boundary conditions
necessary for the closure of the system (4.1) of equations, we follow
Kabel et al. (2016). Let $\mathbb{P}$ and $\mathbb{Q}$ be projectors on
$\mathrm{Sym}(d)$ which are idempotent and complementary

\[
  \mathbb{P}:\mathbb{P} = \mathbb{P}, \quad
  \mathbb{Q}:\mathbb{Q} = \mathbb{Q}, \quad
  \mathbb{P}:\mathbb{Q} = 0, \quad
  \mathbb{Q}:\mathbb{P} = 0, \quad
  \mathbb{P} + \mathbb{Q} = \mathbb{I}, \tag{4.2}
\]

as well as orthogonal with respect to the Frobenius inner product
$(S,T) \mapsto \operatorname{tr}(ST)$, i.e.,

\[
  \operatorname{tr}(S[\mathbb{P}:T]) = \operatorname{tr}(T[\mathbb{P}:S])
  \quad \forall S,T \in \mathrm{Sym}(d), \tag{4.3}
\]
\[
  \operatorname{tr}(S[\mathbb{Q}:T]) = \operatorname{tr}(T[\mathbb{Q}:S])
  \quad \forall S,T \in \mathrm{Sym}(d). \tag{4.4}
\]

The macroscopic loading is encoded in the prescribed strain
$\bar\varepsilon \in \mathrm{Sym}(d)$ and stress
$\bar\sigma \in \mathrm{Sym}(d)$ with

\[
  \mathbb{P}:\bar\varepsilon = \bar\varepsilon
  \quad \text{and} \quad
  \mathbb{Q}:\bar\sigma = \bar\sigma. \tag{4.5}
\]

The boundary conditions are formulated as

\[
  \mathbb{P}:\langle \varepsilon \rangle_Y = \bar\varepsilon
  \quad \text{and} \quad
  \mathbb{Q}:\langle \sigma \rangle_Y = \bar\sigma. \tag{4.6}
\]
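
As a concrete instance of such a projector pair, the following sketch constructs projectors for a uniaxial test in which the mean strain component 11 is prescribed and all remaining mean stress components vanish, and checks the properties (4.2), (4.3) and (4.5) numerically. The representation as full fourth-order NumPy arrays and all function names are assumptions chosen for illustration.

```python
import numpy as np

def sym_identity(d=3):
    """Identity on symmetric second-order tensors as a fourth-order array."""
    eye = np.eye(d)
    return 0.5 * (np.einsum('ik,jl->ijkl', eye, eye)
                  + np.einsum('il,jk->ijkl', eye, eye))

def ddot44(A, B):
    """Double contraction of two fourth-order tensors."""
    return np.einsum('ijkl,klmn->ijmn', A, B)

def ddot42(A, T):
    """Double contraction of a fourth-order with a second-order tensor."""
    return np.einsum('ijkl,kl->ij', A, T)

# Direction of the prescribed strain component (unit Frobenius norm).
E = np.zeros((3, 3))
E[0, 0] = 1.0

P = np.einsum('ij,kl->ijkl', E, E)  # selects the 11-component
Q = sym_identity() - P              # complementary projector

eps_bar = 0.01 * E                  # prescribed mean strain, P:eps_bar = eps_bar
sigma_bar = np.zeros((3, 3))        # prescribed mean stress, Q:sigma_bar = sigma_bar

assert np.allclose(ddot44(P, P), P) and np.allclose(ddot44(Q, Q), Q)
assert np.allclose(ddot44(P, Q), 0.0) and np.allclose(ddot44(Q, P), 0.0)
assert np.allclose(ddot42(P, eps_bar), eps_bar)
assert np.allclose(ddot42(Q, sigma_bar), sigma_bar)
```

Pure strain loading corresponds to `P = sym_identity()` and `Q = 0` in this representation, pure stress loading to the opposite choice.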


4.2.2 Variational formulation of the cell problem 


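Under additional assumptions, the set of equations (4.1) and (4.6) can
be derived from a variational principle. Assume an energy density
$w : Y \times \mathrm{Sym}(d) \to \mathbb{R}$ is given. For instance, $w$
can be given as a hyperelastic energy or the statically condensed
incremental potential of a generalized standard material (Lahellec and
Suquet, 2007). Let $w \in C^1$ in $\varepsilon$ and assume the stress can
be derived from the hyperelastic relation
$\sigma = \frac{\partial w}{\partial \varepsilon}(\varepsilon)$, where we
suppress the $x \in Y$-dependence. Consider the minimization problem
(Kabel et al., 2016) in terms of the strain fluctuations
$\hat\varepsilon = \varepsilon - \bar\varepsilon$

\[
  W(\hat\varepsilon) \to \min \quad \text{for} \quad
  \hat\varepsilon \in U \subset L^2(Y;\mathrm{Sym}(d)) \tag{4.7}
\]

with

\[
  W(\hat\varepsilon) = \left\langle w(\bar\varepsilon + \hat\varepsilon)
  - \bar\sigma : \hat\varepsilon \right\rangle_Y. \tag{4.8}
\]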
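The subspace under consideration is

\[
  U = \left\{ \hat\varepsilon \in L^2(Y;\mathrm{Sym}(d)) \;\middle|\;
  \hat\varepsilon = \langle \hat\varepsilon \rangle_Y + \nabla^s u, \;
  u \in H^1_\#(Y;\mathbb{R}^d), \;
  \mathbb{P}:\langle \hat\varepsilon \rangle_Y = 0 \right\}, \tag{4.9}
\]

where $H^1_\#(Y;\mathbb{R}^d)$ denotes the Sobolev space of periodic and
mean-free vector fields $u : Y \to \mathbb{R}^d$. Denote by
$DW(\hat\varepsilon)$ the differential of $W$. Critical points of $W$ are
characterized by

\[
  DW(\hat\varepsilon)[S] = 0 \quad \forall S \in U, \quad \text{where} \quad
  DW(\hat\varepsilon)[S] = \left\langle
  \frac{\partial w}{\partial \varepsilon}(\bar\varepsilon + \hat\varepsilon) : S
  - \bar\sigma : S \right\rangle_Y. \tag{4.10}
\]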


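By the Helmholtz decomposition of elasticity, see App. A, the operator
$\Gamma = \nabla^s (\operatorname{div} \nabla^s)^{-1} \operatorname{div}$
is a projector onto the mean-free and compatible fields². Hence, we can
write the variation $S$ in (4.10) as

\[
  S = \mathbb{Q}:\langle S \rangle_Y + \Gamma : S. \tag{4.11}
\]

Inserting this expression into (4.10) we obtain

\[
  \left\langle \left[ \left( \Gamma + \mathbb{Q}:\langle \cdot \rangle_Y \right) :
  \left( \frac{\partial w}{\partial \varepsilon}(\varepsilon) - \bar\sigma \right)
  \right] : S \right\rangle_Y = 0, \tag{4.12}
\]

and as $S$ is arbitrary, this is equivalent to

\[
  \mathbb{Q}:\left\langle \frac{\partial w}{\partial \varepsilon}(\varepsilon)
  \right\rangle_Y = \bar\sigma
  \quad \text{and} \quad
  \Gamma : \frac{\partial w}{\partial \varepsilon}(\varepsilon) = 0. \tag{4.13}
\]

The condition $\Gamma : \sigma = 0$ is equivalent to
$\operatorname{div} \sigma = 0$. Thus, with our initial choice of $U$, we
have recovered (4.1) and (4.6).


² Here, we chose $\mathbb{C}^0 = \mathbb{I}$. Later, the reference material is reinterpreted in the context of
gradient descent methods as a parameter for the step size in Sec. 4.5.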
4.2.3 Lippmann-Schwinger equation


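The basic scheme by Moulinec and Suquet (1994; 1998) is based on the
Lippmann-Schwinger equation of elasticity

\[
  \varepsilon = \bar\varepsilon + \mathbb{D}^0 : \bar\sigma
  - \left( \Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle_Y \right) :
  \left( F(\varepsilon) - \mathbb{C}^0 : \varepsilon \right) \tag{4.14}
\]

with the homogeneous reference stiffness
$\mathbb{C}^0 : \mathrm{Sym}(d) \to \mathrm{Sym}(d)$, the reference
compliance $\mathbb{D}^0 = (\mathbb{C}^0)^{-1}$ and the strain-based Green
operator
$\Gamma^0 = \nabla^s (\operatorname{div} \mathbb{C}^0 \nabla^s)^{-1} \operatorname{div}$.
Throughout this chapter, we assume that the reference stiffness is a
multiple of the identity, $\mathbb{C}^0 = \alpha_0 \mathbb{I}$. The
important property here is that $\mathbb{C}^0$ commutes with $\mathbb{Q}$
and $\mathbb{P}$; for a formulation with general $\mathbb{C}^0$, see Kabel
et al. (2016). Solving (4.14) is equivalent to solving (4.1) with (4.6).
More precisely, all $\varepsilon$ for which (4.14) holds are solutions of
the system (4.1) of equations with boundary conditions (4.6) and vice
versa. A derivation for $\mathbb{P} = \mathbb{I}$ can be found, for
instance, in Chapter 12 of Milton's book (Milton, 2002). The fixed-point
scheme associated to the Lippmann-Schwinger equation (4.14)

\[
  \varepsilon_{k+1} = \bar\varepsilon + \mathbb{D}^0 : \bar\sigma
  - \left( \Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle_Y \right) :
  \left( F(\varepsilon_k) - \mathbb{C}^0 : \varepsilon_k \right), \tag{4.15}
\]

is precisely Moulinec-Suquet's basic scheme. The operator $\Gamma^0$ is
evaluated in Fourier space.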
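
To make the structure of the fixed-point iteration (4.15) tangible, the following sketch applies it to a strongly simplified setting: a one-dimensional periodic bar with linear elastic phases under prescribed mean strain, i.e., the analogue of $\mathbb{P} = \mathbb{I}$ and $\mathbb{Q} = 0$. In one dimension, the Green operator reduces in Fourier space to a division by $\alpha_0$ for all non-zero frequencies and to zero for the mean. The function name, the stopping criterion and the reference material choice are assumptions for this illustration and do not reproduce the three-dimensional implementation.

```python
import numpy as np

def basic_scheme_1d(stiffness, mean_strain, tol=1e-10, max_iter=10_000):
    """Basic scheme for a 1D periodic bar with pointwise linear stiffness."""
    stiffness = np.asarray(stiffness, dtype=float)
    # Reference material: mean of smallest and largest stiffness (assumption).
    alpha0 = 0.5 * (stiffness.min() + stiffness.max())
    eps = np.full(stiffness.shape, float(mean_strain))
    for k in range(1, max_iter + 1):
        # Polarization F(eps) - C0:eps, here with F(eps) = stiffness * eps.
        tau = stiffness * eps - alpha0 * eps
        tau_hat = np.fft.fft(tau)
        tau_hat[0] = 0.0  # the Green operator annihilates the mean value
        eps_new = mean_strain - np.fft.ifft(tau_hat).real / alpha0
        change = np.linalg.norm(eps_new - eps)
        eps = eps_new
        if change <= tol * np.linalg.norm(eps):
            return eps, k
    return eps, max_iter

# Two-phase bar with stiffness contrast 10 and equal volume fractions.
C = np.array([1.0] * 4 + [10.0] * 4)
eps, iterations = basic_scheme_1d(C, mean_strain=0.01)
sigma = C * eps  # constant at equilibrium, equal to the harmonic mean times 0.01
```

For this example, the converged stress is uniform and the effective stiffness equals the harmonic mean of the phase stiffnesses, which provides a simple correctness check.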

Concerning the computational cost of a fixed-point iteration (4.15), we
can distinguish between two cases. If the evaluation of $F(\varepsilon)$
is cheap, e.g., for linear elastic materials, most time is spent with the
application of $\Gamma^0$ and the associated Fourier transforms. However,
for more complicated material models, the computation of $F(\varepsilon)$
dominates the runtime. As we will discuss in Sec. 4.3, single crystal
elasto-viscoplasticity falls firmly into the latter category. Based on
the observation that under certain assumptions the inverse
$\varepsilon = F^{-1}(\sigma)$ is much easier to compute in this case, the
stress-based formulation of the cell problem will be discussed in Sec. 4.4.


4.3 Material model for single crystal 
elasto-viscoplasticity 


4.3.1 Constitutive assumptions 


In small-strain plasticity, it is assumed that the strain can be additively
decomposed

\[
  \varepsilon = \varepsilon_e + \varepsilon_p \tag{4.16}
\]

into an elastic part $\varepsilon_e$ and a plastic part $\varepsilon_p$,
see Ch. 2 in Simo and Hughes (1998). For linear elastic behavior, the
stress is related to the elastic strain via Hooke's law

\[
  \sigma = \mathbb{C} : \varepsilon_e
         = \mathbb{C} : (\varepsilon - \varepsilon_p) \tag{4.17}
\]

with the stiffness tensor
$\mathbb{C} : \mathrm{Sym}(d) \to \mathrm{Sym}(d)$ or in strain-explicit
form

\[
  \varepsilon = \mathbb{D} : \sigma + \varepsilon_p \tag{4.18}
\]

with the compliance tensor $\mathbb{D} = \mathbb{C}^{-1}$. In
elasto-viscoplasticity, the evolution of the plastic strain is given by a
constitutive flow rule of the form $\dot\varepsilon_p = r(\sigma, z)$ with
a finite number of internal variables $z$ (Simo and Hughes, 1998). For
crystalline materials, we assume that the plastic deformations are
realized in the form of simple shears on crystallographic slip systems,
see Ch. 10 in Bertram (2011). Slip systems are characterized by their slip
plane normal $n$ and their slip direction $d$. They signify close-packed
planes and directions in the crystal lattice, respectively, see Ch. 3 in
Hull and Bacon (2011). Hence, in single-crystal elasto-viscoplasticity,
the flow rule takes the form (Bishop, 1953)

\[
  \dot\varepsilon_p = \sum_{\alpha=1}^{N} \dot\gamma_\alpha M_\alpha \tag{4.19}
\]

where $\dot\gamma_\alpha$ denotes the plastic slip rate and
$M_\alpha = d_\alpha \otimes^s n_\alpha$ denotes the symmetrized Schmid
tensor on the $\alpha$-th of $N$ slip systems, respectively. Plastic slip
in a system is activated by the projected shear stress
$\tau_\alpha = \sigma \cdot M_\alpha$ (Bishop, 1953). Thus, for the
constitutive flow rule for the slip rate we assume the form

\[
  \dot\gamma_\alpha = f(\tau_\alpha, \tau^F_\alpha) \tag{4.20}
\]

where $\tau^F_\alpha$ denotes the scalar critical shear stress in system
$\alpha$ (Maniatty et al., 1992; Cuitiño and Ortiz, 1993). In the current
work, we only consider isotropic hardening and neglect the effects of
kinematic hardening. To complete the set of constitutive equations,
hardening relations for $\tau^F_\alpha$ have to be provided. In the
following, we adopt the simplification that the critical shear stress is
equal in all slip systems, $\tau^F_\alpha = \tau^F$, and depends on the
accumulated plastic slip

\[
  \dot\gamma = \sum_{\alpha=1}^{N} |\dot\gamma_\alpha| \tag{4.21}
\]

in the form of a hardening law $\tau^F = h(\gamma)$. For instance, $h$ may
arise as the integrated form of a Kocks-Mecking type dislocation
storage-recovery model (Kocks and Mecking, 2003). Kubin et al. (2008)
found that the reduction to a single hardening variable was a reasonably
good approximation for fcc crystals. In numerical experiments, Maniatty
et al. (1992) found that the impact of this simplification on the
effective mechanical properties of a polycrystal was small.
4.3.2 Formulation as a generalized standard material


An isothermal generalized standard material at small strains with
internal variables $z$ is described by two convex potentials $\psi$ and
$\phi$ (Halphen and Nguyen, 1975; Germain et al., 1983). The volume
specific Helmholtz free energy density $\psi$ defines the stress-strain
relation and the driving force $A$ associated to $z$ by

\[
  \sigma = \frac{\partial \psi}{\partial \varepsilon}(\varepsilon, z)
  \quad \text{and} \quad
  A = -\frac{\partial \psi}{\partial z}(\varepsilon, z) \tag{4.22}
\]

and the dissipation potential $\phi$ relates the driving forces to the
rates of the internal variables

\[
  A \in \partial \phi(\dot z) \tag{4.23}
\]

where $\partial \phi$ denotes the subdifferential of $\phi$. In terms of
the Legendre transform of $\phi(\dot z)$

\[
  \phi^*(A) = \sup_{\dot z} \left( A \cdot \dot z - \phi(\dot z) \right), \tag{4.24}
\]

the evolution of $z$ can be equivalently written as

\[
  \dot z \in \partial \phi^*(A). \tag{4.25}
\]

For generalized standard materials, Lahellec and Suquet (2007) show
that after a backward Euler time discretization there exists a condensed
incremental potential $w(\varepsilon)$ so that the potential relation

\[
  \sigma = \frac{\partial w}{\partial \varepsilon}(\varepsilon) \tag{4.26}
\]


holds. For the crystal plasticity model, we assume a free energy of the
following form

\[
  \psi(\varepsilon, \varepsilon_p, \gamma)
  = \frac{1}{2} (\varepsilon - \varepsilon_p) : \mathbb{C} :
    (\varepsilon - \varepsilon_p) + \psi_h(\gamma) \tag{4.27}
\]

with internal variables $z = \{\varepsilon_p, \gamma\}$ which is
additively split into a quadratic elastic energy and an isotropic
hardening energy $\psi_h$. The functional dependency of $\psi_h$ on
$\gamma$ is phenomenological and assumed here for the sake of simplicity.
This ansatz for the free energy leads to the stress-strain relation (4.17)
and the driving forces

\[
  \sigma_p = -\frac{\partial \psi}{\partial \varepsilon_p}(\varepsilon, \varepsilon_p)
           = \mathbb{C} : (\varepsilon - \varepsilon_p)
  \quad \text{and} \quad
  \tau^F = \frac{\partial \psi_h}{\partial \gamma}(\gamma), \tag{4.28}
\]

hence $\sigma_p = \sigma$ and $A = \{\sigma, -\tau^F\}$. In
viscoplasticity, flow rules are generally formulated in terms of the
stress, see Fritzen and Leuschner (2013) and Ch. 2 in Lemaitre and
Chaboche (1990). Therefore, the dual dissipation potential
$\phi^*(\sigma, \tau^F)$ is usually prescribed, so that

\[
  (\dot\varepsilon_p, -\dot\gamma) \in \partial \phi^*(\sigma, \tau^F). \tag{4.29}
\]


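A common ansatz is the Chaboche-type potential, see Chapter 6 in
Lemaitre and Chaboche (1990),

\[
  \phi^*(\sigma, \tau^F) = \frac{\dot\gamma_0 \, \tau^D}{m+1}
  \sum_{\alpha=1}^{N}
  \left\langle \frac{|\tau_\alpha| - \tau^F}{\tau^D} \right\rangle_+^{m+1} \tag{4.30}
\]

with reference slip rate $\dot\gamma_0$, drag stress $\tau^D$, stress
exponent $m$ and the Macaulay brackets defined by
$\langle \cdot \rangle_+ = \max(0, \cdot)$. Differentiating w.r.t.
$\sigma$ and $\tau^F$ recovers the evolution equations (4.19) and (4.21)
with the flow rule

\[
  \dot\gamma_\alpha = \dot\gamma_0 \, \operatorname{sgn}(\tau_\alpha)
  \left\langle \frac{|\tau_\alpha| - \tau^F}{\tau^D} \right\rangle_+^{m}, \tag{4.31}
\]

see Fritzen and Leuschner (2013). Another popular approach for the
evolution of the plastic slip is

\[
  \dot\gamma_\alpha = \dot\gamma_0 \, \operatorname{sgn}(\tau_\alpha)
  \left| \frac{\tau_\alpha}{\tau^F} \right|^{m} \tag{4.32}
\]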
by Hutchinson (1976). Steinmann and Stein (1996) proposed the associated
potential

\[
  \phi^*(\sigma, \tau^F) = \frac{\dot\gamma_0 \, \tau^F}{m+1}
  \sum_{\alpha=1}^{N} \left| \frac{\tau_\alpha}{\tau^F} \right|^{m+1} \tag{4.33}
\]

which recovers (4.19) with flow rule (4.32). Note that the resulting
equation for the accumulated slip

\[
  \dot\gamma = \frac{m}{m+1} \sum_{\alpha=1}^{N}
  \left| \frac{\tau_\alpha}{\tau^F} \right| |\dot\gamma_\alpha| \tag{4.34}
\]

corresponds to the standard formulation (4.21) only in the
rate-independent limit as $m \to \infty$ and $\tau_\alpha \approx \tau^F$.
Consequently, a crystal plasticity model as described in Sec. 4.3.1 with
Hutchinson's flow rule (4.32) is not a generalized standard material.
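
The step from the potential of Steinmann and Stein to the evolution of the accumulated slip can be verified by a short calculation. The following worked derivation is a sketch that assumes $\tau^F > 0$, the potential in the form $\phi^* = \frac{\dot\gamma_0 \tau^F}{m+1} \sum_\alpha |\tau_\alpha / \tau^F|^{m+1}$ and the sign convention $\dot\gamma = -\partial \phi^* / \partial \tau^F$ implied by (4.29).

```latex
\begin{align*}
  \phi^*(\sigma, \tau^F)
    &= \frac{\dot\gamma_0}{m+1} \sum_{\alpha=1}^{N}
       (\tau^F)^{-m} \, |\tau_\alpha|^{m+1}, \\
  \dot\gamma = -\frac{\partial \phi^*}{\partial \tau^F}
    &= \frac{m}{m+1} \sum_{\alpha=1}^{N}
       \dot\gamma_0 \left| \frac{\tau_\alpha}{\tau^F} \right|^{m+1}
     = \frac{m}{m+1} \sum_{\alpha=1}^{N}
       \left| \frac{\tau_\alpha}{\tau^F} \right| |\dot\gamma_\alpha|,
\end{align*}
% where |\dot\gamma_\alpha| = \dot\gamma_0 |\tau_\alpha / \tau^F|^m
% follows from Hutchinson's flow rule.
```

The factor $m/(m+1)$ tends to one for $m \to \infty$, and the weight $|\tau_\alpha / \tau^F|$ tends to one on active systems where $\tau_\alpha \approx \tau^F$, which is why the standard formulation is only recovered in this limit.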


4.3.3 Evaluation of the material law 


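Applying the implicit Euler time discretization to the evolution equations
(4.19) and (4.21) yields the residual equations

\[
  0 = r_1(\sigma, \gamma) = \mathbb{D} : \sigma - \varepsilon + \varepsilon_p^n
    + \Delta t \sum_{\alpha=1}^{N} f(\tau_\alpha, h(\gamma)) \, M_\alpha, \tag{4.35}
\]
\[
  0 = r_2(\sigma, \gamma) = -\gamma + \gamma^n
    + \Delta t \sum_{\alpha=1}^{N} \left| f(\tau_\alpha, h(\gamma)) \right|. \tag{4.36}
\]

In the primal setting, the material law $\sigma = F(\varepsilon)$ is
evaluated by computing the stress $\sigma$ for a given strain
$\varepsilon$, time step $\Delta t$ and internal variables
$\varepsilon_p^n$, $\gamma^n$ of the previous time step. To this end, the
set of 7 equations, (4.35) and (4.36), can be solved with the
Newton-Raphson method. With

\[
  x = \begin{pmatrix} \sigma \\ \gamma \end{pmatrix}, \quad
  r = \begin{pmatrix} r_1 \\ r_2 \end{pmatrix}, \quad \text{and} \quad
  J = \begin{pmatrix}
        \frac{\partial r_1}{\partial \sigma} & \frac{\partial r_1}{\partial \gamma} \\
        \frac{\partial r_2}{\partial \sigma} & \frac{\partial r_2}{\partial \gamma}
      \end{pmatrix}, \tag{4.37}
\]

the Newton iteration with iteration index $j$ reads

\[
  x^{j+1} = x^{j} + \Delta x \tag{4.38}
\]

where $\Delta x$ is the solution of

\[
  J \, \Delta x = -r(x^{j}). \tag{4.39}
\]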


Solving $r(x) = 0$ is challenging for large stress exponents $m$ and large
time increments, see Wulfinghoff and Böhlke (2013), as the system
becomes ill-conditioned. To obtain fast and robust convergence behavior,
we solve the residual equations with the outlined solution scheme for a
reduced stress exponent $\tilde m$, starting with $\tilde m = 1$.
Subsequently, $\tilde m$ is set to $\min(2 \tilde m, m)$ and $r(x) = 0$ is
solved again with the last converged solution as starting point. Thereby,
each Newton scheme is initiated close to the solution and converges
quickly. This process is repeated until the system is solved with
$\tilde m = m$. An alternative routine which relies on piecewise
linearization of the flow rule was proposed by Wulfinghoff and Böhlke
(2013). Regardless of the chosen approach, evaluation of $J(x)$ and $r(x)$
as well as the solution of (4.39) are computationally expensive. Thus,
evaluating the material law $\sigma = F(\varepsilon)$ dominates the
overall runtime.

The evaluation of the inverse material law $\varepsilon = F^{-1}(\sigma)$,
however, is much less costly. For given $\sigma$, $\Delta t$,
$\varepsilon_p^n$ and $\gamma^n$, the scalar equation (4.36) can be solved
for $\gamma$ independently of $\varepsilon$. If $\gamma$ is known,
$\varepsilon$ can be explicitly computed from (4.35). Thus, in the dual
setting, the implicit material law only involves the solution of a single
scalar equation instead of a system of 7 equations. The fact that
$\varepsilon = F^{-1}(\sigma)$ is cheaper to evaluate than
$\sigma = F(\varepsilon)$ has been taken advantage of in the context of
polarization-based methods by Lebensohn et al. (2012). However, due to
their chosen augmented Lagrangian scheme, see Michel et al. (2001) and
Schneider et al. (2019), the solution of a non-linear system of 6
equations was still required in every material point.
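
The two-stage evaluation of the inverse material law described above can be sketched as follows: first the scalar residual (4.36) is solved for the accumulated slip, then the strain follows explicitly from (4.35). The sketch assumes the Chaboche-type flow rule (4.31), a non-decreasing hardening law (so that the scalar residual is monotone and bisection applies) and a compliance passed as a callable. The bisection solver, all function names and all parameter values are assumptions for illustration, not the implementation used in this work.

```python
import numpy as np

def schmid_tensor(d, n):
    """Symmetrized Schmid tensor for slip direction d and plane normal n."""
    d = np.asarray(d, dtype=float) / np.linalg.norm(d)
    n = np.asarray(n, dtype=float) / np.linalg.norm(n)
    return 0.5 * (np.outer(d, n) + np.outer(n, d))

def slip_rate(tau, tau_f, gamma0_dot, tau_d, m):
    """Chaboche-type flow rule, cf. (4.31)."""
    overstress = max(0.0, (abs(tau) - tau_f) / tau_d)
    return gamma0_dot * np.sign(tau) * overstress ** m

def inverse_material_law(sigma, eps_p_n, gamma_n, dt, schmid, compliance,
                         hardening, gamma0_dot, tau_d, m, tol=1e-12):
    """Return (strain, plastic strain, accumulated slip) for a given stress."""
    # Resolved shear stresses depend on the stress only.
    taus = [float(np.tensordot(sigma, M)) for M in schmid]

    def rates(gamma):
        tau_f = hardening(gamma)
        return [slip_rate(t, tau_f, gamma0_dot, tau_d, m) for t in taus]

    def r2(gamma):  # scalar residual, cf. (4.36)
        return -gamma + gamma_n + dt * sum(abs(r) for r in rates(gamma))

    # For non-decreasing hardening, r2 is non-increasing with r2(lo) >= 0
    # and r2(hi) <= 0, so the root is bracketed.
    lo = gamma_n
    hi = gamma_n + dt * sum(abs(r) for r in rates(gamma_n))
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if r2(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    gamma = 0.5 * (lo + hi)

    # Explicit strain update, cf. (4.35).
    eps_p = eps_p_n + dt * sum(r * M for r, M in zip(rates(gamma), schmid))
    return compliance(sigma) + eps_p, eps_p, gamma
```

In practice, a Newton or secant iteration on the scalar residual would likely converge in fewer evaluations than bisection; bisection is used here only because its convergence is easy to verify.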
4.4 The dual variational framework


A dual formulation of the cell problem (4.1) with the stress $\sigma$ as
primary unknown was presented by Bhattacharya and Suquet (2005) for pure
stress boundary conditions, i.e., $\mathbb{Q} = \mathbb{I}$. In analogy to
Sec. 4.2.2, we will derive the dual case for mixed boundary conditions
through a variational approach. Let the strain energy density $w$ be
convex in $\varepsilon$ and let

\[
  w^*(\sigma) = \sup_{\varepsilon \in L^2(Y;\mathrm{Sym}(d))}
  \left( \sigma : \varepsilon - w(\varepsilon) \right) \tag{4.40}
\]

be the Legendre transform of $w$. As $w \in C^1$ implies $w^* \in C^1$,
the inverse material law is given by
$\varepsilon = \frac{\partial w^*}{\partial \sigma}(\sigma)$.
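
For orientation, the Legendre transform can be worked out explicitly in the simplest case. The following example assumes a linear elastic energy $w(\varepsilon) = \frac{1}{2} \varepsilon : \mathbb{C} : \varepsilon$ with a positive definite stiffness $\mathbb{C}$ and compliance $\mathbb{D} = \mathbb{C}^{-1}$; it is an illustration and not part of the derivation for the crystal plasticity model.

```latex
\begin{align*}
  w^*(\sigma)
    &= \sup_{\varepsilon}
       \left( \sigma : \varepsilon
       - \tfrac{1}{2} \, \varepsilon : \mathbb{C} : \varepsilon \right),
    && \text{supremum attained at } \sigma = \mathbb{C} : \varepsilon,
       \text{ i.e., } \varepsilon = \mathbb{D} : \sigma, \\
    &= \sigma : \mathbb{D} : \sigma
       - \tfrac{1}{2} \, \sigma : \mathbb{D} : \mathbb{C} : \mathbb{D} : \sigma
     = \tfrac{1}{2} \, \sigma : \mathbb{D} : \sigma, \\
  \frac{\partial w^*}{\partial \sigma}(\sigma)
    &= \mathbb{D} : \sigma = F^{-1}(\sigma).
\end{align*}
```

In this case, the dual energy is the complementary elastic energy and its derivative reproduces Hooke's law in strain-explicit form.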


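We seek a minimizer of the problem

\[
  W^*(\hat\sigma) \to \min \quad \text{for} \quad
  \hat\sigma \in U^* \subset L^2(Y;\mathrm{Sym}(d)) \tag{4.41}
\]

with

\[
  W^*(\hat\sigma) = \left\langle w^*(\bar\sigma + \hat\sigma)
  - \bar\varepsilon : \hat\sigma \right\rangle_Y \tag{4.42}
\]

and

\[
  U^* = \left\{ \hat\sigma \in L^2(Y;\mathrm{Sym}(d)) \;\middle|\;
  \operatorname{div} \hat\sigma = 0, \;
  \mathbb{Q}:\langle \hat\sigma \rangle_Y = 0 \right\} \tag{4.43}
\]

where $\hat\sigma = \sigma - \bar\sigma \in V$, see Appendix B. A critical
point is characterized by

\[
  DW^*(\hat\sigma)[T] = 0 \quad \forall T \in U^* \quad \text{with} \quad
  DW^*(\hat\sigma)[T] = \left\langle
  \frac{\partial w^*}{\partial \sigma}(\bar\sigma + \hat\sigma) : T
  - \bar\varepsilon : T \right\rangle_Y. \tag{4.44}
\]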


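The restriction $T \in U^*$ can be expressed as

\[
  T = \mathbb{P}:\langle T \rangle_Y + \mathbb{A} : T \tag{4.45}
\]

by introducing the operator
$\mathbb{A} = \mathbb{I} - \langle \cdot \rangle_Y - \Gamma$ from the
Helmholtz decomposition. $\mathbb{A}$ is the orthogonal projector onto the
divergence- and mean-free fields. Hence, the optimality condition can be
written as

\[
  \left\langle \left[ \left( \mathbb{A} + \mathbb{P}:\langle \cdot \rangle_Y \right) :
  \left( \frac{\partial w^*}{\partial \sigma}(\sigma) - \bar\varepsilon \right)
  \right] : T \right\rangle_Y = 0. \tag{4.46}
\]

This yields the Euler-Lagrange equations

\[
  \mathbb{P}:\left\langle \frac{\partial w^*}{\partial \sigma}(\sigma)
  \right\rangle_Y = \bar\varepsilon
  \quad \text{and} \quad
  \mathbb{A} : \frac{\partial w^*}{\partial \sigma}(\sigma) = 0 \tag{4.47}
\]

which recover (4.1) and (4.6) with the initial restrictions on
$\hat\sigma$.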


4.5 FFT-based solution schemes for the cell 
problem 


4.5.1 The basic scheme 


The basic scheme of Moulinec and Suquet (1994; 1998) was interpreted as 
a gradient descent method by Kabel et al. (2014). Thus, the convergence 
theory for gradient descent became available in the setting of FFT- 
based schemes. In addition, accelerated gradient schemes could be 
applied to FFT-based homogenization (Schneider, 2017a; 2019a). In 
the following, we review the general formulation of gradient descent 
methods and discuss their application to the primal and dual framework 
of computational homogenization. Consider a minimization problem of
the type

\[
  f(x) \to \min, \tag{4.48}
\]

for a continuously differentiable function $f$ on a Hilbert space $V$.
Critical points of $f$ are characterized by

\[
  \nabla f(x) = 0, \tag{4.49}
\]

where the gradient is defined by

\[
  Df(x)[v] = \langle \nabla f(x), v \rangle_V \quad \forall v \in V, \tag{4.50}
\]

with the inner product $\langle \cdot, \cdot \rangle_V$ associated with
$V$. The gradient descent iteration, see Ch. 9 in Boyd and Vandenberghe
(2004), for the solution of this problem is given by

\[
  x_{k+1} = x_k - \gamma_k \nabla f(x_k) \tag{4.51}
\]

which converges for sufficiently small step size $\gamma_k$. Suppose $f$
is strongly convex and has a Lipschitz continuous gradient, i.e.,

\[
  \langle \nabla f(x) - \nabla f(y), x - y \rangle_V \geq \mu \|x - y\|_V^2
  \quad \forall x, y \in V, \tag{4.52}
\]
\[
  \|\nabla f(x) - \nabla f(y)\|_V \leq L \|x - y\|_V
  \quad \forall x, y \in V, \tag{4.53}
\]

with positive constants $\mu$ and $L$. Then the optimal choice for the
step size is given by

\[
  \gamma_k = \frac{2}{\mu + L}, \tag{4.54}
\]
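see Ch. 1 and 2 in Nesterov (2004). In the following, we apply the
gradient descent method to the primal and dual minimization problems
associated to the cell problem of computational homogenization, see
Secs. 4.2.2 and 4.4. With property (4.50), we identify the gradients in
the primal and dual case from (4.12) and (4.46) as

\[
  \nabla W = \left( \Gamma + \mathbb{Q}:\langle \cdot \rangle_Y \right) :
  \left( \frac{\partial w}{\partial \varepsilon}(\varepsilon) - \bar\sigma \right), \tag{4.55}
\]
\[
  \nabla W^* = \left( \mathbb{A} + \mathbb{P}:\langle \cdot \rangle_Y \right) :
  \left( \frac{\partial w^*}{\partial \sigma}(\sigma) - \bar\varepsilon \right). \tag{4.56}
\]

The gradient descent iteration (4.51) then reads

\begin{align}
  \varepsilon_{k+1} &= \bar\varepsilon + \gamma_k \bar\sigma
    - \gamma_k \left( \Gamma + \mathbb{Q}:\langle \cdot \rangle_Y \right) :
    \left( \frac{\partial w}{\partial \varepsilon}(\varepsilon_k)
    - \frac{1}{\gamma_k} \varepsilon_k \right), \notag \\
  \sigma_{k+1} &= \bar\sigma + \gamma_k \bar\varepsilon
    - \gamma_k \left( \mathbb{A} + \mathbb{P}:\langle \cdot \rangle_Y \right) :
    \left( \frac{\partial w^*}{\partial \sigma}(\sigma_k)
    - \frac{1}{\gamma_k} \sigma_k \right), \tag{4.57}
\end{align}

where the identities

\[
  \varepsilon_k = \bar\varepsilon
    + \left( \Gamma + \mathbb{Q}:\langle \cdot \rangle_Y \right) : \varepsilon_k
  \quad \text{and} \quad
  \sigma_k = \bar\sigma
    + \left( \mathbb{A} + \mathbb{P}:\langle \cdot \rangle_Y \right) : \sigma_k, \tag{4.58}
\]

are used. Introducing the reference material
$\mathbb{C}^0 = \frac{1}{\gamma_k} \mathbb{I}$ in the primal formulation
and $\mathbb{D}^0 = \frac{1}{\gamma_k} \mathbb{I}$ in the dual
formulation, we recover the basic scheme by Moulinec-Suquet and the dual
basic scheme by Bhattacharya-Suquet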


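\[
  \varepsilon_{k+1} = \bar\varepsilon + \mathbb{D}^0 : \bar\sigma
  - \left( \Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle_Y \right) :
  \left( F(\varepsilon_k) - \mathbb{C}^0 : \varepsilon_k \right), \tag{4.59}
\]
\[
  \sigma_{k+1} = \bar\sigma + \mathbb{C}^0 : \bar\varepsilon
  - \left( \mathbb{C}^0 : \mathbb{A}
  + \mathbb{C}^0 : \mathbb{P} : \langle \cdot \rangle_Y \right) :
  \left( F^{-1}(\sigma_k) - \mathbb{D}^0 : \sigma_k \right), \tag{4.60}
\]

with $F(\varepsilon) = \frac{\partial w}{\partial \varepsilon}(\varepsilon)$
and $F^{-1}(\sigma) = \frac{\partial w^*}{\partial \sigma}(\sigma)$. For
the problems at hand, (4.7) and (4.41), the conditions (4.52) and (4.53)
translate to

\begin{align}
  \alpha_- \mathbb{I} \leq \mathbb{C}^{\mathrm{tan}} \leq \alpha_+ \mathbb{I}
  \quad &\text{with} \quad
  \mathbb{C}^{\mathrm{tan}} = \frac{\partial^2 w}{\partial \varepsilon^2}, \notag \\
  \beta_- \mathbb{I} \leq \mathbb{D}^{\mathrm{tan}} \leq \beta_+ \mathbb{I}
  \quad &\text{with} \quad
  \mathbb{D}^{\mathrm{tan}} = \frac{\partial^2 w^*}{\partial \sigma^2}, \tag{4.61}
\end{align}

where $\alpha_-$ and $\alpha_+$ are, respectively, the smallest and the
largest eigenvalue of the tangent stiffness $\mathbb{C}^{\mathrm{tan}}$
and $\beta_-$ and $\beta_+$ are the smallest and the largest eigenvalue of
the tangent compliance $\mathbb{D}^{\mathrm{tan}}$, respectively. Thus,
the optimal respective choice for the reference material w.r.t. the
convergence rate of the scheme is

\[
  \mathbb{C}^0 = \frac{\alpha_+ + \alpha_-}{2} \, \mathbb{I}
  \quad \text{and} \quad
  \mathbb{D}^0 = \frac{\beta_+ + \beta_-}{2} \, \mathbb{I}. \tag{4.62}
\]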

Due to similarities in structure, see Table 4.1, the dual scheme can be 
easily implemented into an existing strain-based code. Moreover, all 
accelerated gradient schemes which have been introduced in the primal 
context (Schneider, 2017a; 2019a) carry over to the dual case. 


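Table 4.1: Summary of quantities for the gradient descent algorithm (4.51) in strain- and
stress-based setting

Quantity        Strain-based setting                                   Stress-based setting
$x$             $\varepsilon$                                          $\sigma$
$f(x)$          $\langle w(\varepsilon) - \bar\sigma : \varepsilon \rangle_Y$      $\langle w^*(\sigma) - \bar\varepsilon : \sigma \rangle_Y$
$\nabla f(x)$   $(\Gamma + \mathbb{Q}:\langle\cdot\rangle_Y) : (\frac{\partial w}{\partial \varepsilon}(\varepsilon) - \bar\sigma)$      $(\mathbb{A} + \mathbb{P}:\langle\cdot\rangle_Y) : (\frac{\partial w^*}{\partial \sigma}(\sigma) - \bar\varepsilon)$
$\gamma_k$      $\mathbb{C}^0 = \frac{1}{\gamma_k} \mathbb{I}$         $\mathbb{D}^0 = \frac{1}{\gamma_k} \mathbb{I}$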


4.5.2 The Barzilai-Borwein basic scheme 


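Motivated by quasi-Newton methods, Barzilai and Borwein (1988)
published an iterative algorithm for the selection of the step size $\gamma_k$
in (4.51) which greatly increases the rate of convergence compared to
the choice in (4.54). Two recursive update formulas for the step size

$$\gamma_{k+1} = \gamma_k \left( 1 - \frac{\langle \nabla f(x_k), \nabla f(x_{k+1}) \rangle_V}{\| \nabla f(x_k) \|_V^2} \right)^{-1} \tag{4.63}$$

and

$$\gamma_{k+1} = \gamma_k \left( \frac{\| \nabla f(x_k) \|_V^2 - \langle \nabla f(x_k), \nabla f(x_{k+1}) \rangle_V}{\| \nabla f(x_k) \|_V^2 - 2 \langle \nabla f(x_k), \nabla f(x_{k+1}) \rangle_V + \| \nabla f(x_{k+1}) \|_V^2} \right) \tag{4.64}$$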


were proposed. The method was applied to FFT-based homogenization 
by Schneider (2019a) and displayed excellent speed and robustness 
while using only twice the memory of the basic scheme. Throughout 
this paper, we only consider the second variant (4.64) as it exhibited 
better performance for the given material models. For the initial step
size $\gamma_0$, the step size of the basic scheme was found to be a decent choice.
Note that, due to the recursive nature of the step size selection, the
eigenvalues of the tangent are only needed in the first gradient descent
iteration. In the case of stress-based crystal plasticity, this property is
especially favorable: as the cost of evaluating the inverse material law is
comparatively low, see Sec. 4.6, computing the tangent and its eigenvalues
in every iteration would significantly increase the overall computational
effort.
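
The step size rule can be illustrated on a small model problem. The following is a minimal sketch, not the implementation used in this work: it applies gradient descent to a hypothetical convex quadratic with smallest and largest Hessian eigenvalues `mu` and `L`, once with the fixed step size $2/(\mu + L)$ of the basic scheme and once with the second Barzilai-Borwein formula (4.64). The function name `gradient_descent`, the test matrix `A` and the right-hand side `b` are assumptions made for illustration only.

```python
import numpy as np


def gradient_descent(grad, x0, step0, barzilai_borwein, tol=1e-10, max_iter=10000):
    """Iterate x <- x - gamma * grad(x) until the gradient norm drops below tol.

    If barzilai_borwein is True, gamma is updated with the second
    Barzilai-Borwein formula, which only needs two successive gradients.
    Otherwise gamma stays fixed at step0. Returns (x, iteration count).
    """
    x = np.asarray(x0, dtype=float)
    gamma = step0
    g = grad(x)
    for k in range(1, max_iter + 1):
        x = x - gamma * g
        g_new = grad(x)
        if np.linalg.norm(g_new) <= tol:
            return x, k
        if barzilai_borwein:
            gg = g @ g
            g_gnew = g @ g_new
            denom = gg - 2.0 * g_gnew + g_new @ g_new
            if denom > 0.0:  # guard against a vanishing gradient difference
                gamma *= (gg - g_gnew) / denom
        g = g_new
    return x, max_iter


# Hypothetical model problem: f(x) = 0.5 x.A.x - b.x with eigenvalues in [mu, L]
mu, L = 1.0, 100.0
A = np.diag(np.linspace(mu, L, 10))
b = np.ones(10)
grad = lambda x: A @ x - b

x_fix, it_fix = gradient_descent(grad, np.zeros(10), 2.0 / (mu + L), barzilai_borwein=False)
x_bb, it_bb = gradient_descent(grad, np.zeros(10), 2.0 / (mu + L), barzilai_borwein=True)
print("fixed step:", it_fix, "iterations, Barzilai-Borwein:", it_bb, "iterations")
```

Both variants start from the step size of the basic scheme, so the eigenvalue bounds enter only through `step0`, mirroring the remark above that the extremal eigenvalues are needed in the first iteration only.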


4.5.3 The Newton-CG method 


Newton-Raphson methods are ubiquitous in computational mechanics 
as a solution algorithm for nonlinear systems of equations. In the context 
of minimization, where $\nabla f(x) = 0$ is to be solved, the damped
Newton-Raphson iteration reads

$$x_{k+1} = x_k - \alpha_k H^{-1}(x_k) \nabla f(x_k) \tag{4.65}$$

with the Hessian $H$ of $f$ and a damping factor $\alpha_k \in (0, 1]$, see Ch. 9 in
Boyd and Vandenberghe (2004). Instead of inverting $H(x_k)$, the update
can be performed by

$$x_{k+1} = x_k + \alpha_k \Delta x \tag{4.66}$$

where $\Delta x$ is an approximate solution of

$$H(x_k) \Delta x = -\nabla f(x_k). \tag{4.67}$$

The damping factor $\alpha_k$ is determined by a back-tracking procedure. In
this paper, we use the stopping criteria of Dong (2010)

$$c_2 \langle \nabla f(x_k), \Delta x \rangle_V \le \langle \nabla f(x_k + \alpha_k \Delta x), \Delta x \rangle_V \le c_1 \langle \nabla f(x_k), \Delta x \rangle_V \tag{4.68}$$

with $0 < c_1 < c_2 < 1$. In contrast to the Wolfe conditions (Wolfe, 1969),
Dong’s criteria rely solely on gradient evaluations. This is beneficial, as 
evaluating f requires either the primal or dual condensed incremental 
potential, see Tab. 4.1, which is generally not available in FFT-based 
homogenization. Both w and w* carry no physical meaning as they 
depend on the chosen time discretization and are composed of primal or 
dual free energy and dissipation potential, respectively. 

In the vicinity of a stationary point, the Newton-Raphson method con- 
verges quadratically. However, it can be difficult to actually obtain such 
a convergence rate in practical application. For large problems, (4.67) is 
usually solved iteratively up to a certain tolerance. Solving for Ax with 
the accuracy required for quadratic convergence is generally not feasible 
with respect to the overall computational effort (Knoll and Keyes, 2004). 
The Newton-Raphson method has been applied to FFT-based ho- 
mogenization both in the small- and finite-strain setting (Gélébart 
and Mondon-Cancel, 2013; Kabel et al., 2014) in combination with 
Krylov subspace solvers (Brisard and Dormieux, 2010; Zeman et al., 
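2010). The linear Newton-Raphson equations corresponding to the
Lippmann-Schwinger equations (4.59) and (4.60) read

$$\left[ \mathbb{I} + (\Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle_Y) : (\mathbb{C}(\varepsilon_k) - \mathbb{C}^0) \right] : \Delta \varepsilon = \mathbb{D}^0 : \bar{\sigma} - (\Gamma^0 + \mathbb{D}^0 : \mathbb{Q} : \langle \cdot \rangle_Y) : F(\varepsilon_k), \tag{4.69}$$

$$\left[ \mathbb{I} + (\mathbb{C}^0 : \Lambda^0 + \mathbb{C}^0 : \mathbb{P} : \langle \cdot \rangle_Y) : (\mathbb{D}(\sigma_k) - \mathbb{D}^0) \right] : \Delta \sigma = \mathbb{C}^0 : \bar{\varepsilon} - (\mathbb{C}^0 : \Lambda^0 + \mathbb{C}^0 : \mathbb{P} : \langle \cdot \rangle_Y) : F^{-1}(\sigma_k). \tag{4.70}$$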


The algorithmic solution of this type of equation using the conjugate 
gradient (CG) method was outlined, e.g., by Kabel et al. (2014). 


In the context of FFT-based homogenization, the performance of New- 
ton’s method in comparison to other solution schemes depends heavily 
on the material law. If the evaluation of the material law is cheap 
and comparable to the application of the tangent, then Newton- and 
CG-iterations have similar computational cost. In such a case, fast 
gradient methods outperform the Newton-CG method, considering 
the overall number of iterations (Schneider, 2017a). On the other hand, 
if the material law dominates the overall runtime and the cost of the 
CG-iterations is small in comparison, then Newton-CG is the method of 
choice, see Sec. 4.6. However, the memory requirements of the algorithm 
are steep. Whereas the basic scheme and the Barzilai-Borwein method 
can be implemented with one and two strain-like fields respectively, 
the Newton-CG method requires the last converged solution and 4 
additional fields for the CG algorithm. In 3 spatial dimensions, the
additional storage of the tangent operator $\mathbb{C}^{\mathrm{tan}}$ or $\mathbb{D}^{\mathrm{tan}}$ corresponds to
21 scalars in each voxel, further increasing the required memory to 8.5
strain-like fields. We have found, however, that the last converged 
solution and the tangent operator can be stored in single precision 
without significantly affecting the convergence of the Newton scheme. 
Thereby, the memory footprint can be reduced to 6.25 strain-like fields. 
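
The two memory figures quoted above follow from simple bookkeeping, which the short sketch below reproduces. It assumes that one strain-like field holds 6 double-precision scalars per voxel and that single precision halves the storage; the function name `newton_cg_memory` is chosen for illustration only.

```python
from fractions import Fraction

SCALARS_PER_STRAIN_FIELD = 6  # independent components of a symmetric 3x3 tensor
SCALARS_PER_TANGENT = 21      # independent components of a symmetric 6x6 tangent


def newton_cg_memory(single_precision_storage=False):
    """Memory of Newton-CG in units of one double-precision strain-like field."""
    factor = Fraction(1, 2) if single_precision_storage else Fraction(1)
    last_solution = 1 * factor  # last converged solution
    cg_fields = Fraction(4)     # working fields of the CG solver, kept in double precision
    tangent = Fraction(SCALARS_PER_TANGENT, SCALARS_PER_STRAIN_FIELD) * factor
    return last_solution + cg_fields + tangent


full = newton_cg_memory()
reduced = newton_cg_memory(single_precision_storage=True)
print(float(full), float(reduced))
```

The tangent alone accounts for 21/6 = 3.5 strain-like fields, which is why storing it in single precision gives the largest share of the saving.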
4.5.4 Eigenvalue precomputation for the stress-based formulation


In the FFT-based homogenization of elasto-viscoplastic materials, we 
face certain challenges in the stress-based setting which are not present 
in the conventional strain-based case. In the following, we give an 
outline of the basic problem and present a remedy in form of appropriate 
preprocessing steps. 

Consider the first load step in which plastification occurs. In all
discussed solution methods, we start with a single iteration of the basic
scheme. This raises the question of how the reference material should be
chosen, since we have no a priori information on the extremal eigenvalues
of the tangent stiffness and compliance. A natural choice is
to consider the eigenvalues of the materials' elastic stiffness $\mathbb{C}$. We
evaluate the adequacy of this choice for the primal case. With the onset
of plastification, the lower bound $\alpha_-$ of the stiffness decreases and
can even approach zero, depending on hardening and viscosity. The
upper bound $\alpha_+$ stays fixed. It can be shown that gradient schemes
converge for the step size $\gamma_k = \frac{1}{L}$ in case the energy $f$ has a Lipschitz
continuous gradient but is not strongly convex (Nesterov, 2004). Even
for an arbitrary decrease of $\alpha_-$, the reference stiffness of the basic scheme
$\mathbb{C}^0 = \frac{1}{2}(\alpha_+ + \alpha_-) \mathbb{I}$ changes at most by a factor of two compared to the
elastic case. Therefore, it is guaranteed that in the first load step $\mathbb{C}^0$
has at least the correct order of magnitude and the solvers can usually
self-correct in the next couple of iterations.

In the stress-based setting, however, we face a different situation. During
plastification, the upper bound of the compliance $\beta_+ = \frac{1}{\alpha_-}$ usually
increases by orders of magnitude and $\beta_-$ remains fixed. The reference
compliance of the basic scheme $\mathbb{D}^0 = \frac{1}{2}(\beta_+ + \beta_-) \mathbb{I}$ is roughly
proportional to $\beta_+$ in this case. Note that in case of $\alpha_- = 0$, e.g. for
perfect elastoplasticity, the dual schemes cannot be used in this form
as $\beta_+ = +\infty$. The error made by using the eigenvalues of the elastic
compliance can be arbitrarily large and leads to an underestimation
of $\mathbb{D}^0$, i.e. an overly large gradient step size. This negatively affects
the convergence, as the solvers take a long time to self-correct. Hence,
a reasonable estimate of $\beta_+$ has to be determined before applying a
solution scheme.
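
The different sensitivity of the two reference materials can be made concrete with a few numbers. The sketch below uses hypothetical, normalized eigenvalue bounds (they are not taken from the material models of this work) and evaluates the optimal reference stiffness and compliance before and after the lower stiffness bound drops by a factor of 500.

```python
def reference_stiffness(alpha_minus, alpha_plus):
    """Scalar factor of the optimal primal reference material."""
    return 0.5 * (alpha_plus + alpha_minus)


def reference_compliance(alpha_minus, alpha_plus):
    """Scalar factor of the optimal dual reference material, using beta = 1/alpha."""
    beta_minus, beta_plus = 1.0 / alpha_plus, 1.0 / alpha_minus
    return 0.5 * (beta_plus + beta_minus)


# Hypothetical normalized bounds: elastic state vs. strongly plastified state
alpha_plus = 1.0
alpha_minus_elastic = 0.5
alpha_minus_plastic = 1.0e-3

stiffness_ratio = (reference_stiffness(alpha_minus_elastic, alpha_plus)
                   / reference_stiffness(alpha_minus_plastic, alpha_plus))
compliance_ratio = (reference_compliance(alpha_minus_plastic, alpha_plus)
                    / reference_compliance(alpha_minus_elastic, alpha_plus))
print(stiffness_ratio, compliance_ratio)
```

With these assumed numbers, the elastic estimate of the reference stiffness is off by less than a factor of two, whereas the elastic estimate of the reference compliance is too small by more than two orders of magnitude.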

Given an appropriate initial guess for the stress field, a cheap and suffi- 
ciently accurate method is to evaluate the material law and eigenvalues 
in the voxel with the highest von Mises stress. In a setting with multiple 
load steps, this works well in conjunction with an affine extrapolation 
of the primary field (Moulinec and Suquet, 1998). However, we lack 
an initial stress field in the first load step. The conventional choice
of $\sigma^0 = \bar{\sigma} + (\mathbb{D}^0)^{-1} : \bar{\varepsilon}$ is not useful for the eigenvalue precomputation.
For common load cases such as strain-controlled uniaxial extension, $\bar{\sigma}$
vanishes and $\sigma^0$ relies on $\mathbb{D}^0$ which we want to estimate in the first place.


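Algorithm 4 Basic scheme for the Reuss mixing

1: $\sigma_R \leftarrow 0$
2: repeat
3:     $\varepsilon_R \leftarrow \sum_{i=1}^{N} c_i F_i^{-1}(\sigma_R)$
4:     $\mathbb{D}_R^{\mathrm{tan}} \leftarrow \sum_{i=1}^{N} c_i \frac{\partial F_i^{-1}}{\partial \sigma}(\sigma_R)$
5:     compute $\beta_+$, $\beta_-$ from $\mathbb{D}_R^{\mathrm{tan}}$
6:     $\mathbb{D}^0 \leftarrow \frac{\beta_+ + \beta_-}{2} \mathbb{I}$
7:     $\sigma_R \leftarrow \bar{\sigma} + \mathbb{C}^0 : (\bar{\varepsilon} - \mathbb{P} : (\varepsilon_R - \mathbb{D}^0 : \sigma_R))$
8: until convergence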
To obtain a better estimate for the initial stress, we use the Reuss model,
i.e. we search the stress $\sigma_R \in \mathrm{Sym}(d)$ so that

$$\mathbb{Q} : \sigma_R = \bar{\sigma}, \qquad \mathbb{P} : \varepsilon_R = \bar{\varepsilon} \quad \text{with} \quad \varepsilon_R = \sum_{i=1}^{N} c_i F_i^{-1}(\sigma_R) \tag{4.71}$$

holds, where $c_i$ denote the volume fractions of our constituents and
$F_i^{-1}$ stand for their corresponding inverse material laws. Using the
Reuss model can be seen as a minimization of $W^*$ in (4.41) under the
assumption that the stress field is constant. We solve for the Reuss
estimate $\sigma_R$ using the basic scheme which is presented in Alg. 4 for
the convenience of the reader. Since the algorithm only operates on a
single stress matrix, its computational expense is negligible regardless of
iteration count. The choice $\sigma^0 = \sigma_R$ was found to be a good starting point
for the FFT-based homogenization schemes and enables the estimation
of $\beta_\pm$.
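
The Reuss iteration can be sketched in a few lines. The example below is an illustration under simplifying assumptions and not the implementation used in this work: all strain components are prescribed (so that $\mathbb{P}$ is the identity and $\bar{\sigma} = 0$), the stress is represented by a short vector instead of a symmetric tensor, and the inverse material laws are hypothetical smooth functions created by the helper `make_phase`.

```python
import numpy as np


def make_phase(compliance, a):
    """Hypothetical inverse material law eps = (compliance + a*|sigma|^2) * sigma."""
    def inverse_law(sigma):
        return (compliance + a * (sigma @ sigma)) * sigma

    def tangent_compliance(sigma):
        # Derivative of the inverse law with respect to sigma
        n = sigma.size
        return ((compliance + a * (sigma @ sigma)) * np.eye(n)
                + 2.0 * a * np.outer(sigma, sigma))

    return inverse_law, tangent_compliance


def reuss_stress(phases, fractions, eps_bar, tol=1e-10, max_iter=200):
    """Basic scheme on a single stress vector with fully prescribed strain."""
    sigma = np.zeros_like(eps_bar)
    for it in range(1, max_iter + 1):
        eps = sum(c * law(sigma) for c, (law, _) in zip(fractions, phases))
        residual = eps_bar - eps
        if np.linalg.norm(residual) <= tol * np.linalg.norm(eps_bar):
            return sigma, it
        d_tan = sum(c * tan(sigma) for c, (_, tan) in zip(fractions, phases))
        beta = np.linalg.eigvalsh(d_tan)  # eigenvalues in ascending order
        d0 = 0.5 * (beta[0] + beta[-1])   # reference compliance D0 = d0 * identity
        sigma = sigma + residual / d0     # sigma + C0 : (eps_bar - eps), C0 = 1/d0
    raise RuntimeError("Reuss iteration did not converge")


# Hypothetical two-phase example
phases = [make_phase(1.0, 0.5), make_phase(0.2, 0.05)]
fractions = [0.7, 0.3]
eps_bar = np.array([1.0, 0.5, 0.0])

sigma_R, n_iter = reuss_stress(phases, fractions, eps_bar)
print(sigma_R, n_iter)
```

The eigenvalues of the averaged tangent compliance computed in the last pass are the quantities that would serve as the estimate of the compliance bounds for the first FFT-based iteration.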

Analogously, the Voigt average can be used to estimate the initial strain
field in the primal case, i.e. find $\varepsilon_V \in \mathrm{Sym}(d)$ so that

$$\mathbb{P} : \varepsilon_V = \bar{\varepsilon}, \qquad \mathbb{Q} : \sigma_V = \bar{\sigma} \quad \text{with} \quad \sigma_V = \sum_{i=1}^{N} c_i F_i(\varepsilon_V) \tag{4.72}$$

holds. This represents the minimization of $W$ in (4.7) with a constant
strain field. However, the effect on the performance of the strain-based
FFT-solvers is rather small.
4.6 Numerical demonstrations


4.6.1 Setup 


All algorithms were implemented in Python 2.7, supplemented by 
Cython (Behnel et al., 2011) extensions for the computationally expensive 
operations, i.e. the application of $\Gamma^0$ and $\Lambda^0$ and the evaluation of the
material law. For the computation of the fast Fourier transforms, we 
relied on the FFTW library (Frigo and Johnson, 2005). The critical parts 
of the code were parallelized using OpenMP. The computations ran on 6 
threads on a desktop computer with 32 GB RAM and an Intel i7 CPU 
with 6 cores and a clock rate of 3.7 GHz. 
The staggered grid discretization (Schneider et al., 2016) was utilized 
throughout because of its superior performance for perfectly plastic 
material behavior. Notice that the Helmholtz decomposition, see Ap- 
pendix A, is available for the staggered grid discretization (Schneider 
et al., 2016). In case of multiple load steps, an affine linear extrapolation 
(Moulinec and Suquet, 1998) was used for the primary field. 
In this section, we will compare primal, dual and primal-dual algorithms. 
Primal algorithms are those based on the primal basic scheme, i.e. the 
Barzilai-Borwein method and the primal Newton-CG scheme. For
these methods, a strain field serves as the variable to iterate on. For every 
iteration, the strain field is compatible, and the iterative scheme seeks 
an equilibrated stress field. The latter is quantified by the convergence 
criterion 

lasam en lle? _ 5 (4.73) 

Il (ox)y || 

see Sec. 5 in Schneider et al. (2019) for details. The dual schemes are
based on the dual basic scheme by Bhattacharya and Suquet (2005). They
iterate on equilibrated stress fields until the associated strain
field is compatible. We check this by the criterion


po [Tet — ellz <ô, (4.74) 
I| (ex)y | 

which is simply the dual of the primal convergence criterion. Last 
but not least, we also briefly touch upon a primal-dual algorithm, the 
Eyre-Milton method (Eyre and Milton, 1999). This scheme iterates on 
a variable called polarization. Compatibility and equilibrium of the 
associated strain and stress fields, respectively, are only satisfied upon 
convergence. For the Eyre-Milton scheme, the consistent convergence 


criterion 
1 ||Pk+1 — Pilz 


2 Kay 


is used, see Sec. 5 in Schneider et al. (2019) for a derivation. 


(4.75) 


Due to the differences of these three schemes, the convergence criteria are 
not strictly comparable. Still, these criteria have been chosen to ensure 
the maximum degree of fairness in comparison, taking into account 
functional analytic and physical considerations. For our computations, 
we consistently used ô = 107°. 

Last but not least let us remark that, for the Eyre-Milton method, we 
use the complexity reduction trick described in Sec. 6 of Schneider et al. 
(2019), which reduces the complexity of a single Eyre-Milton iteration to 
the level of a primal basic step for small-strain crystal viscoplasticity. 


4.6.2 Polycrystalline microstructure 


Setup and material parameters. In the following section, we investigate
a periodic polycrystalline microstructure with 81 grains and a resolution
of $64^3$ voxels. The microstructure was generated using the Voronoi
tessellation routine of the software Neper (Quey et al., 2011) with
uniformly distributed grain orientations. We consider two single-crystal
elasto-viscoplasticity models as described in Sec. 4.3.1 using the flow
rules of Chaboche and Hutchinson, respectively

$$\dot{\gamma}_\alpha = \dot{\gamma}_0 \, \mathrm{sgn}(\tau_\alpha) \left\langle \frac{|\tau_\alpha| - \tau^F}{\tau^D} \right\rangle^n, \qquad \dot{\gamma}_\alpha = \dot{\gamma}_0 \, \mathrm{sgn}(\tau_\alpha) \left| \frac{\tau_\alpha}{\tau^F} \right|^n. \tag{4.76}$$

Figure 4.1: Left: Polycrystalline microstructure ($64^3$ voxels) and accumulated plastic slip $\gamma$
at 1% tensile strain. Right: Stress-strain diagram of the material models (with Chaboche's
and with Hutchinson's flow rule) for a tensile strain rate of 0.001/s


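For the hardening law, we use a linear exponential approach based on
the accumulated plastic slip $\gamma$, see (4.21)

$$\tau^F = \tau_0 + (\tau_\infty - \tau_0) \left( 1 - \exp\left( -\frac{\Theta_0 - \Theta_\infty}{\tau_\infty - \tau_0} \gamma \right) \right) + \Theta_\infty \gamma \tag{4.77}$$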


where $\tau_0$ denotes the initial yield stress, $\Theta_0$ and $\Theta_\infty$ respectively denote
the initial and asymptotic hardening modulus and $\tau_\infty$ denotes the
saturated yield stress for $\Theta_\infty = 0$. The material model using the Hutchinson
flow rule is not a generalized standard material and its tangent stiffness is
not symmetric. Hence, the convergence of the solution schemes outlined
in Sec. 4.5 is not theoretically justified in this case. The performance of the
solvers in combination with this flow rule is still of interest, as it is widely
used (Lebensohn, 2001; Lebensohn et al., 2012), and is therefore included
in our investigations. For the tangent operators and their eigenvalues,
we consider the symmetric part of the tangent stiffness or compliance.
Additionally, we add Eyre-Milton's method (Eyre and Milton, 1999)
to our list of investigated solvers in the strain-based setting, as similar
polarization-based schemes have been widely adopted in the context of
single-crystal plasticity (Lebensohn et al., 2012; Shantraj et al., 2015). For
a discussion of the theoretical background, algorithmic parameters and
the convergence criterion for this family of solvers, we refer to Schneider
et al. (2019).


Table 4.2: Material parameters of the single-crystal material models (Simmons and Wang,
1971; Eghtesad et al., 2018a)

Common parameters
Stiffness       $C_{11} = 170.2$ GPa    $C_{12} = 114.9$ GPa    $C_{44} = 61.0$ GPa
Flow rule       $\dot{\gamma}_0 = 0.001\ \mathrm{s}^{-1}$    $n = 20$
Hardening       $\Theta_0 = 250$ MPa    $\Theta_\infty = 14$ MPa    $\tau_\infty = 113.5$ MPa
Lattice type    FCC
Slip systems    $\{111\}\langle 110 \rangle$

Model w. Chaboche's flow rule      $\tau_0 = 6.5$ MPa    $\tau^D = 8$ MPa
Model w. Hutchinson's flow rule    $\tau_0 = 14.5$ MPa


All material parameters for the single-crystal plasticity models are listed
in Tab. 4.2. The stiffness parameters correspond to OFHC copper at
room temperature and are taken from Simmons and Wang (1971). For
the model with Hutchinson's flow rule, the viscoplastic and hardening
parameters of Eghtesad et al. (2018a) were used. Note that Eghtesad et
al. prescribe a slightly different formulation for the linear exponential
hardening law. However, the asymptotic behavior of their hardening
approach is identical to the formulation in this study. The boundary
conditions in the following numerical demonstrations correspond to
a strain-controlled uniaxial tensile test, i.e. a uniaxial stress state, up
to 1% strain with an applied strain rate of 0.001/s. We investigate two
cases where the load is applied in a single step and in 50 steps of 0.02%,
respectively. In case of the material model with Chaboche's flow rule,
the material parameters $\tau_0$ and $\tau^D$ were chosen so that the stress-strain
curves for the given load case are roughly equivalent for both models,
see Fig. 4.1.


Convergence behavior and runtime for a single load step. We investigate
the case of a single load step up to 1% strain in tensile direction
using the material model with Chaboche's flow rule. Fig. 4.2 compares
the residual of the different solvers as a function of computation time.
The basic scheme was omitted in this plot for the convenience of the
reader, as its required computation time was an order of magnitude
larger in comparison to the other schemes. All total runtimes are given
in Tab. 4.3, together with the iteration counts. Note that we only counted
Newton iterations and backtracking steps, i.e. evaluations of the material
law, for the Newton-CG solver and omitted the CG iterations.

Figure 4.2: Polycrystal: Residual vs. computation time (Chaboche flow rule). (a) Primal
setting, (b) Dual setting. Legend: Barzilai-Borwein, Newton-CG, Eyre-Milton

Table 4.3: Polycrystal: Computation times and iteration counts (Chaboche flow rule)

                                   Primal     Dual
Basic scheme    Comp. time in s    8950.6     5720.0
                Iter. count        4877       14774
BB              Comp. time in s    839.2      235.0
                Iter. count        480        821
Newton-CG       Comp. time in s    301.1      73.5
                Iter. count        151        139
Eyre-Milton     Comp. time in s    1241.0     -
                Iter. count        1141       -

For both the primal and dual setting, the Newton-CG solver exhibits 
the best performance. Due to the large load step, we start far from 
the converged solution. Consequently, the Newton-CG solver takes 
many smaller steps and backtracking iterations in the beginning. After 
reaching a residual of $10^{-3}$, it converges rapidly. The Barzilai-Borwein
method is the second fastest scheme and exhibits a non-monotonic 
convergence behavior, both for the primal and the dual setting. The 
convergence rate of the Eyre-Milton scheme is similar to the other solvers 
up to a residual of 107? but slows down considerably afterwards. 
Comparing the computation times of the solvers in the primal and dual 
case, we see a considerable increase in speed for the latter due to the 
cheaper evaluation of the inverse material law. For each solver, the 
computation time decreases by a factor of 1.5 to 4 in the stress-based 
setting. The second fastest solver in the dual setting, the Barzilai-Borwein 
scheme, is still faster than the primal Newton-CG method, with the
additional benefit of reduced memory consumption. 
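
The quoted speedup range can be checked directly against the runtimes in Tab. 4.3. The following small sketch only restates the tabulated values; the dictionary layout and variable names are chosen for illustration.

```python
# Total computation times in s from Tab. 4.3 (Chaboche flow rule): (primal, dual)
times = {
    "Basic scheme": (8950.6, 5720.0),
    "Barzilai-Borwein": (839.2, 235.0),
    "Newton-CG": (301.1, 73.5),
}

# Speedup of the stress-based (dual) over the strain-based (primal) setting
speedup = {name: primal / dual for name, (primal, dual) in times.items()}

for name, factor in speedup.items():
    print(f"{name}: dual setting faster by a factor of {factor:.2f}")

# The dual Barzilai-Borwein scheme is compared with the primal Newton-CG method
bb_dual_vs_newton_primal = times["Newton-CG"][0] / times["Barzilai-Borwein"][1]
```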


Figure 4.3: Polycrystal: Residual over computation time (Hutchinson flow rule). Legend:
Barzilai-Borwein, Newton-CG, Eyre-Milton


In the following, we consider the same load as in the previous section
using the material model with the flow rule by Hutchinson, see Fig. 4.3
and Tab. 4.4. Qualitatively, the convergence behavior of the solvers is
similar to the case using Chaboche's flow rule. In general, we observe
lower iteration counts. This can be attributed to a higher tangent stiffness,
i.e. lower inner material contrast, for large load steps when using
Hutchinson's flow rule. The largest effect can be seen for the Eyre-Milton
scheme, where the iteration count decreases by a factor of 10. As a result,
it performs only marginally slower than the Newton-CG method in the
primal setting. To achieve stable convergence, the Eyre-Milton scheme
tracks the extremal eigenvalues over the entire solution history, see
Schneider et al. (2019). Therefore, a low tangential stiffness can lead to
a subsequent slowdown of the whole solution scheme, even if it only
occurs in a single iteration. This explains the much slower convergence
in case of the model with Chaboche's flow rule. Furthermore, it is
noteworthy that the speedup in the dual setting is larger in this case,
with factors of 10 to 20 in computation times for each solver. This is a
consequence of the lower iteration count and the cheaper evaluation
of the inverse material law in case of Hutchinson's flow rule. For the
Chaboche flow rule, all overstress terms $\tau_\alpha - \tau^F$ have to be recomputed
for each evaluation of the residual (4.36) while only $\tau^F$ has to be updated
in case of Hutchinson's flow rule. This leads to a lower number of
computationally expensive exponentiations in the latter case. The dual
Barzilai-Borwein method notably profits from this fact and is only 2.5
times slower than the dual Newton-CG method and 4 times faster than
the primal Newton-CG method.

Table 4.4: Polycrystal: Computation times and iteration counts (Hutchinson flow rule)

                                   Primal     Dual
Basic scheme    Comp. time in s    6683.2     624.4
                Iter. count        2912       2762
BB              Comp. time in s    588.8      32.9
                Iter. count        267        309
Newton-CG       Comp. time in s    138.2      13.3
                Iter. count        55         31
Eyre-Milton     Comp. time in s    190.1      -
                Iter. count        95         -

Convergence behavior and runtime for 50 load steps. For the
computations in this section, we applied the load of 1% tensile strain in
50 steps. In analogy to the last section, we first consider the model
using Chaboche's flow rule. The computation time of the different
solvers in each load step is plotted in Fig. 4.4. All solvers, except for
Newton-CG in the primal setting, exhibit a peak in computation time
in the second step with the onset of plastification. In the subsequent
steps, the runtime decreases and reaches a stable value approximately
at step 20. To evaluate the overall performance, the mean computation
times and iterations per step are listed in Tab. 4.5. For each solver, the
iteration counts are similar in the primal and dual setting while the
computation times are lower by a factor of 3 to 6 in the latter. As for the
single load step, Newton-CG exhibits the best performance throughout.
However, in the primal case, the Eyre-Milton scheme takes less than
twice as long to converge with a third of the required memory. Similarly,
the Barzilai-Borwein scheme is slower than Newton-CG by a factor of
2.5 in the dual setting, with the same low memory requirements as
Eyre-Milton.

Figure 4.4: Polycrystal: Computation time of each load step (Chaboche flow rule)

Considering the material model with Hutchinson's flow rule, the results
are similar to the case using Chaboche's flow rule in the primal setting,
see Fig. 4.5 and Tab. 4.6. The stress-based computations run faster than
for the model with Chaboche's flow rule, for the same reasons as
in the single step case. However, the difference is less pronounced for
small load steps as the inverse material law needs fewer iterations to
converge. We further notice that the dual Barzilai-Borwein scheme is
close in performance to the dual Newton-CG method, being only 25%
slower. While the iteration count of Barzilai-Borwein is 4.5 times higher
in comparison, the iterations are much less costly as neither the tangent
computation nor the solution of a linear system are required.

Table 4.5: Polycrystal: Mean computation times and iteration counts (Chaboche flow rule)

                                            Primal    Dual
Basic scheme        Mean comp. time in s    150.0     50.5
                    Mean iter. count        123.6     193.1
Barzilai-Borwein    Mean comp. time in s    31.7      5.6
                    Mean iter. count        27.7      33.2
Newton-CG           Mean comp. time in s    9.6       2.4
                    Mean iter. count        5.6       5.1
Eyre-Milton         Mean comp. time in s    16.5      -
                    Mean iter. count        15.3      -

Table 4.6: Polycrystal: Mean computation times and iteration counts (Hutchinson flow
rule)

                                            Primal    Dual
Basic scheme        Mean comp. time in s    118.5     15.2
                    Mean iter. count        82.5      75.7
Barzilai-Borwein    Mean comp. time in s    27.9      1.8
                    Mean iter. count        20.6      17.8
Newton-CG           Mean comp. time in s    11.3      1.4
                    Mean iter. count        5.8       4.0
Eyre-Milton         Mean comp. time in s    14.6      -
                    Mean iter. count        10.2      -

Figure 4.5: Polycrystal: Computation time of each load step (Hutchinson flow rule)

Effective material properties. In the following, we discuss the effective
elastic and plastic properties of the polycrystalline microstructure. The
overall elastic behavior is characterized by the effective stiffness
$\bar{\mathbb{C}} : \mathrm{Sym}(d) \to \mathrm{Sym}(d)$ which relates effective stress $\bar{\sigma} = \langle \sigma \rangle_Y$ and effective
strain $\bar{\varepsilon} = \langle \varepsilon \rangle_Y$ by

$$\bar{\sigma} = \bar{\mathbb{C}} : \bar{\varepsilon} \tag{4.78}$$

assuming linear elastic behavior for the single crystalline phase. Using
the elastic parameters in Tab. 4.2, the effective stiffness of the
polycrystalline structure, given in Voigt notation,

$$\bar{\mathbb{C}} \,\widehat{=}\, \begin{pmatrix}
194.1 & 103.0 & 102.8 & -0.1 & -0.2 & 0.7 \\
103.0 & 191.9 & 105.0 & -1.5 & 0.3 & 0.4 \\
102.8 & 105.0 & 192.2 & 1.6 & -0.2 & -1.1 \\
-0.1 & -1.5 & 1.6 & 46.5 & -1.0 & 0.3 \\
-0.2 & 0.3 & -0.2 & -1.0 & 44.2 & -0.2 \\
0.7 & 0.4 & -1.1 & 0.3 & -0.2 & 44.2
\end{pmatrix} \text{GPa} \tag{4.79}$$

was identified through 6 linear elastic computations. The isotropic part
of the stiffness can be computed by

$$\bar{\mathbb{C}}^{\mathrm{iso}} = (\bar{\mathbb{C}} :: \mathbb{P}_1) \mathbb{P}_1 + \frac{1}{5} (\bar{\mathbb{C}} :: \mathbb{P}_2) \mathbb{P}_2 \tag{4.80}$$

with the projector $\mathbb{P}_1$ onto the spherical $d \times d$ matrices and projector $\mathbb{P}_2$
onto the deviatoric, i.e. trace-free, $d \times d$ matrices. For the given effective
stiffness, $\bar{\mathbb{C}}^{\mathrm{iso}}$ corresponds to a material with a Young's modulus of $E =
120.8$ GPa and a Poisson ratio of $\nu = 0.35$. In this case, the anisotropic
part of the stiffness $\bar{\mathbb{C}}^{\mathrm{aniso}} = \bar{\mathbb{C}} - \bar{\mathbb{C}}^{\mathrm{iso}}$ is small with $\|\bar{\mathbb{C}}^{\mathrm{aniso}}\| / \|\bar{\mathbb{C}}\| = 0.017$
where $\|\mathbb{C}\| = \sqrt{\mathbb{C} :: \mathbb{C}}$. Hence, $\bar{\mathbb{C}}^{\mathrm{iso}}$ is a reasonable approximation of $\bar{\mathbb{C}}$.
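
The isotropic projection can be retraced numerically from the effective stiffness listed above. The sketch below is an illustration and not part of the original computations: it assumes the standard Voigt convention for stiffness matrices and converts to the normalized (Mandel) representation, in which the full contraction of two fourth-order tensors becomes an elementwise sum over their 6x6 matrices. The function name `isotropic_part` is hypothetical.

```python
import numpy as np

# Effective stiffness in Voigt notation (GPa), entries as listed in the text
C_voigt = np.array([
    [194.1, 103.0, 102.8, -0.1, -0.2,  0.7],
    [103.0, 191.9, 105.0, -1.5,  0.3,  0.4],
    [102.8, 105.0, 192.2,  1.6, -0.2, -1.1],
    [ -0.1,  -1.5,   1.6, 46.5, -1.0,  0.3],
    [ -0.2,   0.3,  -0.2, -1.0, 44.2, -0.2],
    [  0.7,   0.4,  -1.1,  0.3, -0.2, 44.2],
])


def isotropic_part(c_voigt):
    """Return bulk modulus K, shear modulus G and the relative anisotropy."""
    s = np.diag([1.0, 1.0, 1.0] + [np.sqrt(2.0)] * 3)
    c = s @ c_voigt @ s                 # normalized (Mandel) representation
    p1 = np.zeros((6, 6))
    p1[:3, :3] = 1.0 / 3.0              # projector onto spherical tensors
    p2 = np.eye(6) - p1                 # projector onto deviatoric tensors
    three_k = np.sum(c * p1)            # full contraction with P1
    two_g = np.sum(c * p2) / 5.0        # full contraction with P2, divided by 5
    c_iso = three_k * p1 + two_g * p2
    ratio = np.linalg.norm(c - c_iso) / np.linalg.norm(c)
    return three_k / 3.0, two_g / 2.0, ratio


K, G, anisotropy = isotropic_part(C_voigt)
E = 9.0 * K * G / (3.0 * K + G)                 # Young's modulus
nu = (3.0 * K - 2.0 * G) / (2.0 * (3.0 * K + G))  # Poisson ratio
print(E, nu, anisotropy)
```

Since the matrix entries are rounded to one decimal place, the recomputed values agree with the quoted Young's modulus, Poisson ratio and anisotropy measure only up to rounding.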
Figure 4.6: Polycrystalline microstructure: Contraction ratio $q$ at various load angles in the
xz-, xy-, and yz-plane

The plastic anisotropy of polycrystals can be characterized by the
contraction ratio

$$q = -\frac{\bar{\varepsilon}_p : (n_2 \otimes n_2)}{\bar{\varepsilon}_p : (n_1 \otimes n_1)} \tag{4.81}$$

which can be identified in a given plane by performing tensile tests at
various angles, with the load direction $n_1$ and the orthogonal direction
in the chosen plane $n_2$. Here, $\bar{\varepsilon}_p \in \mathrm{Sym}(d)$ denotes the effective plastic
strain which is computed by

$$\bar{\varepsilon}_p = \bar{\varepsilon} - \bar{\mathbb{D}} : \bar{\sigma} \quad \text{with} \quad \bar{\mathbb{D}} = \bar{\mathbb{C}}^{-1}. \tag{4.82}$$

The contraction ratio is connected to the commonly used Lankford
coefficient $r$ by $q = \frac{r}{1 + r}$. To characterize the plastic anisotropy of the
polycrystalline microstructure, computations corresponding to uniaxial
tensile tests were performed at various angles in the xz-, xy- and yz-
plane using the elasto-viscoplastic material model with Chaboche's flow
rule. The resulting contraction ratios are plotted in Fig. 4.6, where
the load angle is taken with respect to the x-direction for the xz- and
xy-plane and with respect to the y-direction for the yz-plane. We observe
that all contraction ratios $q$ fluctuate around the value of 0.5 which
signifies plastically isotropic behavior. The largest deviation is about 0.1.
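
As a small illustration of the contraction ratio, the sketch below evaluates it for a hypothetical, volume-preserving and plastically isotropic strain state and converts the result to the Lankford coefficient. The function names and the sample strain tensor are assumptions for illustration; the minus sign follows the usual convention that a lateral contraction yields a positive ratio.

```python
import numpy as np


def contraction_ratio(eps_p, n1, n2):
    """Negative ratio of the plastic strain across (n2) and along (n1) the load."""
    n1 = np.asarray(n1, dtype=float) / np.linalg.norm(n1)
    n2 = np.asarray(n2, dtype=float) / np.linalg.norm(n2)
    return -(n2 @ eps_p @ n2) / (n1 @ eps_p @ n1)


def lankford_coefficient(q):
    """Invert q = r / (1 + r) for the Lankford coefficient r."""
    return q / (1.0 - q)


def in_plane_directions(angle_deg):
    """Load direction and in-plane orthogonal direction in the xy-plane."""
    a = np.deg2rad(angle_deg)
    return (np.array([np.cos(a), np.sin(a), 0.0]),
            np.array([-np.sin(a), np.cos(a), 0.0]))


# Hypothetical isochoric, isotropic plastic strain for tension along n1
n1, n2 = in_plane_directions(30.0)
eps_p = 0.01 * (1.5 * np.outer(n1, n1) - 0.5 * np.eye(3))

q = contraction_ratio(eps_p, n1, n2)
r = lankford_coefficient(q)
print(q, r)
```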


4.6.3 Directionally solidified NiAl-9Mo fiber structure 


Setup and material parameters. We investigate the high-temperature
creep of a NiAl-9Mo eutectic. Using directional solidification, this
material develops a characteristic microstructure where well-aligned
single-crystal molybdenum fibers with square cross section are embedded
in a nickel-aluminum matrix (Bei and George, 2005). We consider a
unit cell of 1200 x 160 x 160 voxels with a fiber volume content of 14%
and a fiber aspect ratio of 100 (Haenschke et al., 2010), see Fig. 4.7. The
microstructure was generated by a random sequential addition
algorithm (Widom, 1966). The spatial resolution of the fibers is 8 voxels per
edge length. Due to the large voxel count, we restrict the investigation
to the fastest solvers, i.e. the Newton-CG method in the primal setting
and the Newton-CG as well as the Barzilai-Borwein methods in the dual
setting.

Figure 4.7: Directionally solidified NiAl-9Mo: Microstructure (1200 x 160 x 160 voxels)
and accumulated plastic slip $\gamma$ after 50s

Both materials are modeled according to Albiez et al. (2016a) using the
Hutchinson flow rule (4.32). The nickel-aluminum matrix is assumed to
be perfectly plastic, i.e. $\tau^F = \tau_0^F$. For the molybdenum fibers, we use
the approach in Albiez et al. (2016a)

$$\tau^F = \frac{\tau_\infty}{d \sqrt{\rho} + 1} \quad \text{with} \quad \rho = \rho_s \left[ 1 - \exp\left( -\frac{1}{2} k_2 \gamma \right) \left( 1 - \sqrt{\frac{\rho_0}{\rho_s}} \right) \right]^2 \tag{4.83}$$

with the maximum yield stress $\tau_\infty$, the characteristic length $d$, the
dislocation density $\rho$, its initial value $\rho_0$, its saturation value $\rho_s$ and the
recovery constant $k_2$. Note that we neglect the Taylor hardening term
present in Albiez et al. (2016a), as its contribution is small in this case.
The material parameters for fibers and matrix at 1000°C are listed in
Tab. 4.7. Owing to the large flow resistance of the fibers, the investigated
composite has a high external material contrast in addition to the internal
contrast caused by plastification.

The boundary conditions correspond to a uniaxial compression load of
250 MPa, which is rapidly applied in a single load step of 0.001 s,
corresponding to a strain rate of 2/s in the load direction. The load is
subsequently held for 100 load steps of 0.5 s for a total time of 50 s.
Table 4.7: Directionally solidified eutectic: Material parameters of fibers and matrix at
1000°C (Albiez et al., 2016a)

                Molybdenum fiber                                  Nickel-aluminum matrix
Stiffness       $C_{11} = 404$ GPa                                $C_{11} = 182$ GPa
                $C_{12} = 163$ GPa                                $C_{12} = 120$ GPa
                $C_{44} = 99$ GPa                                 $C_{44} = 85.4$ GPa
Flow rule       $\dot{\gamma}_0 = 8.96\ \mathrm{s}^{-1}$          $\dot{\gamma}_0 = 10^{?}\ \mathrm{s}^{-1}$
                $n = 10.5$                                        $n = 4.04$
Hardening       $\tau_\infty = 3833$ MPa                          $\tau_0^F = 30.75$ MPa
                $d = 0.729\ \mu\mathrm{m}$
                $\rho_0 = 9 \times 10^{?}\ \mathrm{mm}^{-2}$
                $\rho_s = 2.3 \times 10^{?}\ \mathrm{mm}^{-2}$
                $k_2 = 66$
Lattice type    BCC                                               B2
Slip systems    $\{110\}\langle 111 \rangle$                      $\{001\}\langle 100 \rangle$
                $\{112\}\langle 111 \rangle$                      $\{011\}\langle 100 \rangle$
                $\{123\}\langle 111 \rangle$                      $\{011\}\langle 110 \rangle$

Convergence behavior and runtime. In the first few steps after the
rapid initial compression, a load transfer from matrix to fiber takes place
in the form of a viscous effect. Consequently, the phase-specific loads
and fields are not monotonic at the onset of creep, see Albiez et al. (2016a);
Dudová et al. (2011). This leads to an initial peak in computation time,
before it drops around step 10 when the fields stabilize and extrapolation
takes effect, see Fig. 4.8. The peak is most pronounced for the dual
Barzilai-Borwein scheme which requires nearly the same time as the
primal Newton-CG method in the first few steps but speeds up to the
level of the dual Newton-CG method afterwards. Tab. 4.8 allows us to
compare the overall performance of the solvers. The dual Newton-CG
scheme is fastest overall, beating the primal Newton-CG by a factor of
7. It is closely followed by the dual Barzilai-Borwein method, which
requires 5 times as many iterations but only 50% more computation
time.

Effective material properties. A characteristic value for the creep 
strength of a material under a certain stress load is the minimal creep 
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Figure 4.8: Directionally solidified NiAl-9Mo: Computation time of each load step 


Table 4.8: Directionally solidified eutectic: Mean computation times and iteration counts 


                    Mean comp. time in s    Mean iter. count


Newton-CG primal    1315.3                  5.0
Newton-CG dual      186.5                   3.8
BB dual             262.2                   20.2
rate $\dot\varepsilon_{\min}$. It is defined as the minimum of the strain rate in direction $n_\sigma$
of the applied uniaxial stress load

$$\dot\varepsilon = \dot{\bar\varepsilon} : (n_\sigma \otimes n_\sigma). \tag{4.84}$$
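The projection in (4.84) can be illustrated with a short sketch. It assumes that the macroscopic strain tensor is stored for every load step in an array; the names `minimal_creep_rate`, `times` and `strain_history` are hypothetical and not taken from the code used in this work. The rate is approximated by finite differences, and its magnitude is used because the load is compressive.

```python
import numpy as np

def minimal_creep_rate(times, strain_history, load_direction):
    """Minimum magnitude of the strain rate along the load direction, cf. (4.84).

    times:          array of shape (T,), time stamps of the stored load steps
    strain_history: array of shape (T, 3, 3), macroscopic strain per load step
    load_direction: vector of length 3 (normalized internally)
    """
    n = np.asarray(load_direction, dtype=float)
    n = n / np.linalg.norm(n)
    # Strain component in load direction: eps : (n ⊗ n)
    eps_n = np.einsum("tij,i,j->t", np.asarray(strain_history, dtype=float), n, n)
    # Finite-difference approximation of the strain rate between stored steps
    rates = np.abs(np.diff(eps_n) / np.diff(np.asarray(times, dtype=float)))
    return rates.min()

# Illustrative load direction at 15 degrees to the fiber axis (x) in the xz-plane
phi = np.deg2rad(15.0)
n_sigma = np.array([np.cos(phi), 0.0, np.sin(phi)])
```

For a load angle $\varphi$ in the xz-plane, the direction is $n_\sigma = (\cos\varphi, 0, \sin\varphi)$, as in the last two lines.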


To investigate the anisotropic creep behavior of the NiAl-9Mo microstruc- 
ture, we performed creep computations with a compressive uniaxial 
stress load of 250 MPa at different angles with respect to the fiber 
direction, i.e. x-direction, in the xz-plane. The resulting creep curves 
and the corresponding minimal creep rate for each angle are depicted 
in Fig. 4.9. Computations in the xy-plane were conducted as well
Figure 4.9: Creep of directionally solidified NiAl-9Mo at different load angles with respect
to the fiber direction in the xz-plane for an applied load of 250 MPa. Left: Creep curves. 
Right: Minimal creep rates 


and yielded similar results, indicating approximately isotropic creep 
behavior in the yz-plane. Several comments are in order. Compared 
to the computations by Albiez et al. (2016a), the minimal creep rate 
for a load applied in fiber direction is an order of magnitude larger in 
the present study. This can be attributed to the fact that Albiez et al. 
assumed infinitely long fibers. Hence, we observe a pronounced effect 
of the aspect ratio on the effective creep behavior, even for a large ratio 
such as 100. Considering the creep curves at varying load angles, a 
pronounced minimum is only present in the case where load direction 
and fiber alignment coincide. For all other load cases, the creep rate 
reached a stationary value and did not increase afterwards. This is due 
to the fact that the plastic deformation and the accompanying softening 
of the fibers is only activated under high loads as the initial yield stress 
of molybdenum is very large, see Tab. 4.7. The fibers carry such high 
stresses only in the 0° angle load case. In the other cases where stress 
load and fibers are misaligned, a larger part of the load is distributed 
to the less creep-resistant NiAl matrix, which leads to higher effective
creep rates. Even for a small misalignment of 15°, the minimum creep
rate increases by more than two orders of magnitude compared to the 0°
case. For the 45° and 90° case, the creep rate is similar to that reported
for binary NiAl, e.g., by Seemüller et al. (2013) or Whittenberger et al.
(1991). This indicates that the reinforcing effect of the molybdenum 
fibers diminishes at these load angles. 


4.7 Conclusions 


Stress-based FFT schemes were originally conceived by Bhattacharya and
Suquet (2005) to tackle strain-locking materials. We found that they can
also be beneficial for small-strain single-crystal elasto-viscoplasticity,
owing to the stress-explicit formulation of the plastic flow rule. In our
numerical examples, this reduced the computation time by a factor of 2
to 20 in comparison to the strain-based setting. Further research could
assess whether other types of stress-explicit material laws profit
similarly from the stress-based formulation.

For this study, we considered geometrically linear crystal plasticity 
models. To investigate finite deformations, explicit incremental update 
schemes as presented, e.g., by Lebensohn (2001); Lebensohn et al. (2008) 
could be applied after each converged load step. To the best of our 
knowledge, in a Lagrangian finite-strain setting, explicit solutions for 
the update of the inelastic deformations currently exist only for the case 
of viscoelasticity (Shutov et al., 2013). To our knowledge, this does not 
change in the dual case. 

Considering our investigated solution schemes, we found that a good 
initial approximation of the average load and the eigenvalues of the 
tangent was vital to achieve fast and stable convergence in the dual set- 
ting. To this end, suitable and computationally efficient precomputation 
routines were presented. As a result, the solution schemes exhibited
robust convergence behavior and similar iteration counts for both strain-
and stress-based computations. Still, it remains an open question whether
materials without a lower bound on their stiffness, e.g., in the case of
perfect elastoplasticity, can be handled in the dual setting.

Comparing the performance of the solvers, the Newton-CG method 
exhibited the best results throughout. However, in the dual setting, 
the Barzilai-Borwein scheme was in many cases competitive, being 
only slightly slower. For the presented numerical examples, we out- 
lined how the developed methods can be used to characterize the 
complex behavior of polycrystalline compounds. The low memory 
footprint and the high computational efficiency of the stress-based 
Barzilai-Borwein method enable the further study of more complex 
microstructures which require larger cells and higher voxel counts. For 
instance, the investigation of non-unidirectional fiber distributions for 
the NiAl-Mo material in Sec. 4.6.3 or the study of cellular lamellar 
structures formed by NiAl-Cr(Mo) (Wang et al., 2018b) becomes feasible 
with the presented approach.

Chapter 5


Computing the effective response
of heterogeneous materials with
thermomechanically coupled
constituents by an implicit
FFT-based approach¹


5.1 Introduction 


When subjected to a wide range of thermomechanical loadings, the 
interplay between temperature and deformation fields has a signifi- 
cant impact on the effective behavior of structural materials. Clearly, 
variations in temperature lead to changes in the mechanical behavior, 
e.g., in the form of thermal softening. In return, mechanical load- 
ings can induce temperature changes, due to internal dissipation or 
changes in entropy. This interplay of mechanical and thermal effects is 
governed by the balance equations for linear momentum and internal 
energy (in terms of the heat equation). For instance, in the vicinity of 


1 This chapter is based on Wicht et al. (2021b). For the sake of a coherent structure, 
formatting and typography of this thesis, minor changes have been made. To avoid 
redundancies in the text, the introduction has been shortened. References to the 
appendix of the original paper have been replaced with references to Sec. 3.3 and 
Sec. 4.5 which cover similar content.
their glass transition temperature, polymers are particularly sensitive
to temperature variations (Ferry, 1980). Especially when subjected to
cyclic loading, self-heating due to dissipation can critically affect the
mechanical properties of materials and the lifetime of components, see,
e.g., Rittel (2000), Mortazavian and Fatemi (2015) or Katunin (2019).


Thus, for the optimal use of materials, characterizing and predicting 
their thermomechanical behavior is of central importance. For composite 
materials, this proves to be a challenging task as their properties rest 
on their individual constituents and microstructure. In a small-strain 
framework, Chatzigeorgiou et al. (2016) used an asymptotic homog- 
enization approach to derive the governing thermomechanical equa- 
tions on the micro- and macroscale for generalized standard materials
(Germain et al., 1983), taking into account both the microstructure and
the thermomechanical material behavior. This generalized previous
studies using asymptotic approaches, e.g., by Terada et al. (2010) for 
poro-thermoelasticity or Temizer (2012) for finite thermoelasticity. A 
recent review on the homogenization of dissipative materials was given 
by Charalambakis et al. (2018). 

A particular result of the asymptotic homogenization is that the micro- 
scopic balance of linear momentum depends only on the macroscopic 
temperature and is independent of temperature fluctuations on the
microscale (Chatzigeorgiou et al., 2016). In contrast to earlier works
on the homogenization of thermomechanical material properties, e.g., 
by Willis (1981), the uniform temperature on the microscale is not an 
ad-hoc assumption but arises as a direct consequence of first-order 
homogenization. As a result, the thermomechanical problem on the 
microscale may be solved for a homogeneous temperature and is de- 
coupled from microscopic heat-conduction. Based on these results, 
Tikkarrouchine et al. (2019) homogenized unidirectional short-fiber 
structures with temperature-independent material parameters in the 
context of concurrent multiscale simulations, using the finite element
(FE) software ABAQUS. Similar FE-based multiscale studies which
still consider thermal conduction on the microscale were carried out
by Ozdemir et al. (2008) for elastoplasticity and Li et al. (2019) for 
single-crystal elasto-viscoplasticity. 

Motivated by the aforementioned studies, we consider solvers based on 
the fast Fourier transform (FFT) for the computational homogenization 
of thermomechanically coupled materials on the microscale. In this 
context, FFT-based methods have been used to homogenize linear ther- 
moelastic materials (Vinogradov and Milton, 2008; Anglin et al., 2014; 
Ambos et al., 2015) and linear thermo-magneto-electroelastic materials 
(Sixto-Camacho et al., 2013). Shantraj et al. (2019) proposed an FFT-based
staggered algorithm for coupled multi-physics problems, taking thermal 
conduction on the microscale into account. 

To exploit the power of FFT-based methods for computing the effective 
thermomechanical behavior of nonlinear dissipative materials, we rely 
upon the framework of asymptotic homogenization, as pioneered by 
Chatzigeorgiou et al. (2016). Due to the weak coupling of mechanics and 
thermal conduction, the cell problem on the microscale is governed only 
by the microscopic balance of linear momentum and the evolution of 
the macroscopic temperature, see Sec. 5.2. Based on these results, we 
propose a staggered solution algorithm, where strain-field and tempera- 
ture are updated in an alternating fashion, see Sec. 5.3. The proposed 
solution scheme may be applied on top of any iterative strain-based 
solution method and can be easily integrated into existing FFT-based 
computational micromechanics codes. Owing to the homogeneity of the 
temperature on the microscale, the temperature update only involves 
solving a scalar equation and introduces little overhead. The usefulness 
of the approach is demonstrated in Sec. 5.4 for glass-fiber reinforced 
polypropylene composites with strong thermomechanical coupling.


5.2 First-order homogenization of thermomechanical composites


Chatzigeorgiou et al. (2016) introduced a framework for the asymptotic 
homogenization of thermomechanically coupled generalized standard 
materials in the quasi-static small-strain setting. As a result, they ob- 
tained governing equations for macro- and microscale. In the following, 
we review the equations relevant for solving the thermomechanical cell 
problem on the microscopic level. 

Let $Y \subset \mathbb{R}^d$ be a rectangular cell, with microscopic point $x \in Y$ and
$d \in \{1,2,3\}$ spatial dimensions. We denote by $\mathrm{Sym}(d)$ the space of
symmetric $d \times d$ matrices. For the following discussion, we consider the
displacement fluctuation field $u : Y \times [0,T] \to \mathbb{R}^d$, the infinitesimal strain
field $\varepsilon : Y \times [0,T] \to \mathrm{Sym}(d)$, the stress field $\sigma : Y \times [0,T] \to \mathrm{Sym}(d)$,
the heat flux $q : Y \times [0,T] \to \mathbb{R}^d$, the entropy density $s : Y \times [0,T] \to \mathbb{R}$,
internal energy density $e : Y \times [0,T] \to \mathbb{R}$, internal variables
$z : Y \times [0,T] \to Z$ with a sufficiently large vector space $Z$ and the macroscopic
absolute temperature $\theta \in \mathbb{R}_{>0}$. For a heterogeneous Helmholtz free
energy density

$$\psi : Y \times \mathrm{Sym}(d) \times \mathbb{R}_{>0} \times Z \to \mathbb{R}, \quad (x,\varepsilon,\theta,z) \mapsto \psi(x,\varepsilon,\theta,z), \tag{5.1}$$

which is related to the internal energy $e$ by

$$e = \psi + \theta s, \tag{5.2}$$

we express stress and entropy by the potential relations

$$\sigma = \frac{\partial \psi}{\partial \varepsilon}(\cdot,\varepsilon,\theta,z) \quad\text{and}\quad s = -\frac{\partial \psi}{\partial \theta}(\cdot,\varepsilon,\theta,z), \tag{5.3}$$

under the assumption that $\psi$ is differentiable in all arguments except
for the first (Coleman and Noll, 1963). As a result of the asymptotic
homogenization of Chatzigeorgiou et al. (2016), only the macroscopic
temperature enters $\psi$. Thus, the temperature in a microstructure cell
corresponding to a macroscopic point can be interpreted as homogeneous.
We assume that the Helmholtz free energy density can be additively
decomposed

$$\psi(\cdot,\varepsilon,\theta,z) = \psi_{\text{heat}}(\cdot,\theta) + \psi_{\text{mech}}(\cdot,\varepsilon,\theta,z) \tag{5.4}$$

into a component $\psi_{\text{heat}}$ associated with heat storage and a component $\psi_{\text{mech}}$
representing the storage of mechanical energy. This splitting does not
reflect physics, but is computationally convenient, see Sec. 5.3. Many
commonly used thermomechanical material models, such as viscoelasticity
(Tikkarrouchine et al., 2019), elastoplasticity (Chatzigeorgiou et al.,
2016) and viscoplasticity (Stainier and Ortiz, 2010) feature a free energy
in the form of (5.4). The heat capacity density at constant strain

$$c_\varepsilon = -\theta \frac{\partial^2 \psi}{\partial \theta^2} \tag{5.5}$$

is typically assumed to be independent of strain $\varepsilon$ and internal state $z$.
Under this condition, the temperature dependence of the mechanical
free energy $\psi_{\text{mech}}$ may at most be linear. Consequently, we also partition
the entropy

$$s = s_{\text{heat}}(\cdot,\theta) + s_{\text{mech}}(\cdot,\varepsilon,\theta,z) \tag{5.6}$$

with

$$s_{\text{heat}} = -\frac{\partial \psi_{\text{heat}}}{\partial \theta} \quad\text{and}\quad s_{\text{mech}} = -\frac{\partial \psi_{\text{mech}}}{\partial \theta}. \tag{5.7}$$


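For generalized standard materials, the evolution of internal variables is
governed by Biot's equation

$$\frac{\partial \psi}{\partial z}(\cdot,\varepsilon,\theta,z) + \frac{\partial \phi}{\partial \dot z}(\cdot,\theta,\dot z) = 0, \tag{5.8}$$

involving a dissipation potential $\phi : Y \times \mathbb{R}_{>0} \times Z \to \mathbb{R}_{\ge 0}$, $(x,\theta,\dot z) \mapsto \phi(x,\theta,\dot z)$.
We assume that $\phi$ is convex in its third argument and
$\phi(\cdot,\theta,0) = 0$ holds. For the stress and strain field, the microscopic
static balance of linear momentum without volume-force densities

$$\operatorname{div} \sigma = 0, \tag{5.9}$$

and the kinematic compatibility condition

$$\varepsilon = \bar\varepsilon + \nabla^s u \quad\text{with}\quad \bar\varepsilon = \langle \varepsilon \rangle_Y, \tag{5.10}$$

hold, where $\langle \cdot \rangle_Y = \frac{1}{|Y|} \int_Y (\cdot)\, \mathrm{d}V$ denotes the volume average over $Y$
and $\nabla^s$ stands for the symmetrized gradient. The macroscopic temperature
is determined by the macroscopic balance of internal energy

$$\dot{\bar e} = -\operatorname{div}_{\bar x} \langle q \rangle_Y + \langle \sigma : \dot\varepsilon \rangle_Y \quad\text{with}\quad \bar e = \langle e \rangle_Y, \tag{5.11}$$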

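where we neglect additional source terms and $\operatorname{div}_{\bar x}$ denotes the
divergence with respect to the position $\bar x \in \bar\Omega$ in the macroscopic body
$\bar\Omega \subset \mathbb{R}^d$. It is common to reformulate the balance of internal energy as a
heat equation in terms of the entropy

$$\theta \dot{\bar s} = -\operatorname{div}_{\bar x} \langle q \rangle_Y - \left\langle \frac{\partial \psi}{\partial z} \cdot \dot z \right\rangle_Y \quad\text{with}\quad \bar s = \langle s \rangle_Y, \tag{5.12}$$

or the temperature

$$\langle c_\varepsilon \rangle_Y \dot\theta = -\operatorname{div}_{\bar x} \langle q \rangle_Y + \theta \left\langle \frac{\partial^2 \psi}{\partial \theta\, \partial \varepsilon} : \dot\varepsilon \right\rangle_Y + \left\langle \left( \theta \frac{\partial^2 \psi}{\partial \theta\, \partial z} - \frac{\partial \psi}{\partial z} \right) \cdot \dot z \right\rangle_Y. \tag{5.13}$$

Note that, in the small-strain setting, the material time derivative $\dot{(\cdot)}$
reduces to the local time derivative $\partial / \partial t$. As we are only interested in
solving the cell problem, i.e., we only consider a single macroscopic
point, the term $-\operatorname{div}_{\bar x} \langle q \rangle_Y$ cannot be further specified and acts as a
volumetric heat supply term. Hence, we denote $S = -\operatorname{div}_{\bar x} \langle q \rangle_Y$ and treat $S$
and $\bar\varepsilon$ as boundary conditions. For a treatment in a concurrent multiscale
context, see, e.g., Chatzigeorgiou et al. (2016) or Tikkarrouchine et al.
(2019).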
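The step from the energy balance (5.11) to the entropy form (5.12) can be traced in a few lines. The following is a sketch that uses only the relation $e = \psi + \theta s$ from (5.2), the potential relations (5.3) and the homogeneity of the temperature on the microscale; it is a consistency check, not a reproduction of the derivation in the cited literature.

```latex
\begin{align*}
\dot e &= \dot\psi + \dot\theta\, s + \theta\, \dot s
        = \frac{\partial \psi}{\partial \varepsilon} : \dot\varepsilon
        + \frac{\partial \psi}{\partial \theta}\, \dot\theta
        + \frac{\partial \psi}{\partial z} \cdot \dot z
        + \dot\theta\, s + \theta\, \dot s \\
       &= \sigma : \dot\varepsilon + \frac{\partial \psi}{\partial z} \cdot \dot z + \theta\, \dot s
        \qquad \text{since } s = -\frac{\partial \psi}{\partial \theta}, \\
\dot{\bar e} &= \langle \sigma : \dot\varepsilon \rangle_Y
        + \left\langle \frac{\partial \psi}{\partial z} \cdot \dot z \right\rangle_Y
        + \theta\, \dot{\bar s}
        \qquad \text{since } \theta \text{ is homogeneous on } Y, \\
\theta\, \dot{\bar s} &= -\operatorname{div}_{\bar x} \langle q \rangle_Y
        - \left\langle \frac{\partial \psi}{\partial z} \cdot \dot z \right\rangle_Y
        \qquad \text{after inserting (5.11)}.
\end{align*}
```

By Biot's equation (5.8), the last term equals $\langle \partial\phi/\partial\dot z \cdot \dot z \rangle_Y$, i.e., the dissipated power acts as a heat source.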


5.3 Solution scheme for the fully-coupled thermomechanical cell problem


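Consider the Hilbert space $L^2(Y;\mathrm{Sym}(d))$ of $Y$-periodic and square
integrable stress and strain fields with inner product

$$(S,T) \mapsto \langle S,T \rangle_{L^2} = \langle S : T \rangle_Y, \quad S,T \in L^2(Y;\mathrm{Sym}(d)), \tag{5.14}$$

and the induced norm

$$\|S\|_{L^2} = \sqrt{\langle S,S \rangle_{L^2}}, \quad S \in L^2(Y;\mathrm{Sym}(d)). \tag{5.15}$$

For a certain point in time, we want to find a strain field $\varepsilon$ and a
macroscopic temperature $\theta$ which solve equations (5.8) to (5.11) for
prescribed $\bar\varepsilon$ and $S$. For the convenience of the reader, we restrict
ourselves to pure strain boundary conditions, see Kabel et al. (2016) for
an extension to mixed boundary conditions. To solve our problem, we
consider a fixed time step and apply an implicit Euler discretization
in time to our system of equations. We define the operator
$M : L^2(Y;\mathrm{Sym}(d)) \times \mathbb{R}_{>0} \to H^{-1}_{\#}(Y;\mathbb{R}^d)$, where $H^{-1}_{\#}(Y;\mathbb{R}^d)$ denotes
the space of forces, and the function $H : L^2(Y;\mathrm{Sym}(d)) \times \mathbb{R}_{>0} \to \mathbb{R}$

$$M(\varepsilon,\theta) = \operatorname{div} \frac{\partial \psi}{\partial \varepsilon}(\cdot,\varepsilon,\theta,z), \tag{5.16}$$

$$H(\varepsilon,\theta) = \theta \left( \langle s_{\text{heat}}(\cdot,\theta) + s_{\text{mech}}(\cdot,\varepsilon,\theta,z) \rangle_Y - \bar s^n \right) - \Delta t\, S + \left\langle \frac{\partial \psi}{\partial z}(\cdot,\varepsilon,\theta,z) \cdot (z - z^n) \right\rangle_Y \tag{5.17}$$

with the mean entropy $\bar s^n$ and internal variables $z^n$ at the last converged
time step and the time increment $\Delta t$. When evaluating $M(\varepsilon,\theta)$ or $H(\varepsilon,\theta)$,
the internal variables $z$ are computed by solving the discretized Biot's
equation

$$\frac{\partial \psi}{\partial z}(\cdot,\varepsilon,\theta,z) + \frac{\partial \phi}{\partial \dot z}\left(\cdot,\theta,\frac{z - z^n}{\Delta t}\right) = 0, \tag{5.18}$$

for given strain field $\varepsilon$ and temperature $\theta$. The thermomechanical cell
problem is defined by the system of equations

$$M(\varepsilon,\theta) = 0, \tag{5.19}$$

$$H(\varepsilon,\theta) = 0, \tag{5.20}$$

where (5.19) describes the mechanical problem for the strain field $\varepsilon$
and (5.20) is the thermal problem, determining the evolution of the
temperature $\theta$.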

There exist two general approaches for solving the thermomechanically 
coupled problem. In monolithic schemes, (5.19) and (5.20) are solved 
simultaneously, whereas staggered approaches treat the sub-problems 
(5.19) and (5.20) separately (Armero and Simo, 1992; Rothe et al., 2015). 
Monolithic approaches enjoy unconditional stability, but the resulting 
system is usually non-symmetric (Armero and Simo, 1992). Provided 
each sub-problem by itself is symmetric, staggered schemes circumvent 
this difficulty and thereby enable using more efficient solution algo- 
rithms (Simo and Miehe, 1992; Riedlbauer et al., 2014). Furthermore, 
they are convenient in terms of implementation, as existing solvers for
the sub-problems may be used (Erbts and Düster, 2012; Martins et al.,
2017; Shantraj et al., 2019). Hence, we focus on staggered algorithms in
the following.
Typically, staggered schemes are based on an isothermal split (Simo and
Miehe, 1992; Armero and Simo, 1992), where the mechanical problem
is solved for a fixed temperature and the thermal problem is solved for
a fixed strain field. More precisely, for given iterates $\varepsilon_k$ and $\theta_k$, where
$\theta_0 = \theta^n$ is set to the temperature in the last converged time step, the
following steps are performed:
1. Solve $M(\varepsilon,\theta_k) = 0$ with fixed temperature $\theta_k$ and assign the solution
to $\varepsilon_{k+1}$.
2. Solve $H(\varepsilon_{k+1},\theta) = 0$ with fixed strain field $\varepsilon_{k+1}$ and assign the
solution to $\theta_{k+1}$.
In this context, we distinguish between explicit and implicit staggered
schemes. For explicit schemes, steps 1 and 2 are carried out only once,
whereas for implicit schemes the steps are repeated until a prescribed
convergence criterion is fulfilled.
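The difference between the explicit and the implicit variant can be illustrated on a toy problem. The sketch below assumes two hypothetical callables, `solve_mech` and `solve_heat`, that return the exact solution of each sub-problem for a frozen partner variable; the scalar model and the stopping rule on the temperature increment are illustrative choices, not part of the method described here.

```python
def staggered_isothermal(solve_mech, solve_heat, theta_n, max_passes=1, tol=1e-10):
    """Isothermal split: max_passes=1 is the explicit scheme, larger values
    repeat steps 1 and 2 until the temperature stops changing (implicit)."""
    theta = theta_n
    for _ in range(max_passes):
        eps = solve_mech(theta)        # step 1: mechanics at fixed temperature
        theta_new = solve_heat(eps)    # step 2: temperature at fixed strain
        converged = abs(theta_new - theta) <= tol * max(abs(theta_new), 1.0)
        theta = theta_new
        if converged:
            break
    return eps, theta

# Toy coupled system: eps = 1 + 0.5 * theta and theta = 0.4 * eps,
# whose simultaneous solution is theta = 0.5, eps = 1.25.
toy_mech = lambda theta: 1.0 + 0.5 * theta
toy_heat = lambda eps: 0.4 * eps

eps_explicit, theta_explicit = staggered_isothermal(toy_mech, toy_heat, 0.0)
eps_implicit, theta_implicit = staggered_isothermal(toy_mech, toy_heat, 0.0, max_passes=100)
```

In this toy setting, the explicit pass stops at the first temperature update, whereas the repeated passes approach the solution of the coupled system.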
Thus, explicit schemes are naturally faster. However, they suffer from
lower accuracy (Vaz Jr. et al., 2011; Martins et al., 2017) and are prone to
instabilities (Armero and Simo, 1992; 1993; Erbts and Düster, 2012) for
problems with strong thermomechanical coupling. To address the latter
difficulty, Armero and Simo (1992; 1993) proposed an unconditionally
stable adiabatic split, where (5.19) is solved under the condition $\dot s = 0$.
In the present work, we do not follow this approach (see below for
a discussion) and consider an implicit staggered approach with an
isothermal split. Implicit staggered schemes enjoy the same accuracy as
monolithic algorithms (Rothe et al., 2015) and have been shown to be
more stable than explicit schemes (Erbts and Düster, 2012). However,
when repeating steps 1 and 2 until convergence, the sub-problems (5.19)
and (5.20) have to be solved multiple times per time step, which is
computationally expensive. Therefore, we propose two simplifications
to enhance the overall efficiency of the scheme.


First, suppose we have an iterative strain-based fixed point scheme

$$\varepsilon_{k+1} = F(\varepsilon_k,\theta), \tag{5.21}$$

which solves (5.19) for a fixed temperature $\theta$. For better readability,
we suppress the possible dependency of $F$ on additional algorithmic
parameters and the boundary conditions $\bar\varepsilon$. Instead of solving (5.19) after
each temperature update, we only perform a single iteration (5.21) of
the mechanical solver.

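The second simplification concerns the temperature update, i.e., solving
(5.20). Evaluating $H(\varepsilon,\theta)$ is computationally expensive, as it involves
solving (5.18) for all points in $Y$ to compute the mechanical entropy
$s_{\text{mech}}(\cdot,\varepsilon_k,\theta_k,z_k)$ and the dissipation $\frac{\partial \psi}{\partial z}(\cdot,\varepsilon_k,\theta_k,z_k) \cdot (z_k - z^n)$, see (5.17).
To obtain an efficient algorithm, we wish to avoid this operation outside
of (5.21), i.e., without improving our current guess for the strain field.
Thus, we propose an additive split of $H(\varepsilon,\theta)$

$$H(\varepsilon,\theta) = H_{\text{impl}}(\theta) + H_{\text{expl}}(\varepsilon,\theta) \tag{5.22}$$

into an implicit part $H_{\text{impl}}(\theta)$

$$H_{\text{impl}}(\theta) = \theta \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y - \theta \bar s^n - \Delta t\, S \tag{5.23}$$

and an explicit part $H_{\text{expl}}(\varepsilon,\theta)$

$$H_{\text{expl}}(\varepsilon,\theta) = \theta \langle s_{\text{mech}}(\cdot,\varepsilon,\theta,z) \rangle_Y + \left\langle \frac{\partial \psi}{\partial z}(\cdot,\varepsilon,\theta,z) \cdot (z - z^n) \right\rangle_Y, \tag{5.24}$$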
following our partition of the entropy. We emphasize that this splitting
is not physical but computationally convenient. Instead of solving
$H(\varepsilon_{k+1},\theta) = 0$ for updating the temperature, we solve
$H_{\text{impl}}(\theta) + H_{\text{expl}}(\varepsilon_k,\theta_k) = 0$. More precisely, we compute the effective mechanical
entropy

$$\bar s_{\text{mech},k} = \langle s_{\text{mech}}(\cdot,\varepsilon_k,\theta_k,z_k) \rangle_Y, \tag{5.25}$$

as well as the mean dissipation

$$D_k = \left\langle \frac{\partial \psi}{\partial z}(\cdot,\varepsilon_k,\theta_k,z_k) \cdot (z_k - z^n) \right\rangle_Y \tag{5.26}$$

as part of our mechanical iteration (5.21), see Miehe (1995). Subsequently,
we solve

$$H_{\text{split}}(\theta) = 0 \quad\text{with}\quad H_{\text{split}}(\theta) = \theta \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y + \theta \left( \bar s_{\text{mech},k} - \bar s^n \right) - \Delta t\, S + D_k. \tag{5.27}$$

This is significantly more efficient than solving $H(\varepsilon_{k+1},\theta) = 0$, as it only
involves the effective entropy related to heat storage, which is efficiently
computed by

$$\langle s_{\text{heat}}(\cdot,\theta) \rangle_Y = \sum_{j=1}^{N} c_j\, s_{\text{heat},j}(\theta) \tag{5.28}$$

for an $N$-phase composite material with volume fractions $c_j$ and
phase-specific entropies $s_{\text{heat},j}$.

To summarize, our modified implicit algorithm involves the following
steps, which are repeated until the convergence criterion of the
mechanical solver is met:

1. Update the strain field $\varepsilon_{k+1} = F(\varepsilon_k,\theta_k)$ with a single iteration of
the mechanical solver (5.21). Compute $\bar s_{\text{mech},k}$ and $D_k$ as part of the
iteration.

2. Solve $H_{\text{split}}(\theta) = 0$ and assign the solution to $\theta_{k+1}$.

The proposed algorithm is compatible with any mechanical solver in the
form of (5.21), including classical FE-based methods.
For our concrete implementation, we rely on FFT-based solution
schemes, due to their computational efficiency (Eisenlohr et al., 2013;
Lucarini and Segurado, 2019; Rovinelli et al., 2020). In particular,
we consider Moulinec-Suquet's basic scheme (Moulinec and Suquet,
1998), the Barzilai-Borwein method (Schneider, 2019a) and the inexact
Newton-CG method (Kabel et al., 2014), see Sec. 3.3 and Sec. 4.5.
Typically, the iteration scheme (5.21) involves applying the operator
$\Gamma = \nabla^s (\operatorname{div} \nabla^s)^{-1} \operatorname{div}$ in Fourier space and evaluating the material law
$\sigma = \frac{\partial \psi}{\partial \varepsilon}(\cdot,\varepsilon,\theta,z)$. As convergence criterion for the static equilibrium
(5.19), we use

$$\frac{\|\Gamma : \sigma\|_{L^2}}{\|\langle \sigma \rangle_Y\|_{L^2}} \le \delta_{\text{mech}}, \tag{5.29}$$

see Sec. 5 of Schneider et al. (2019) for further details. The mean
mechanical entropy $\bar s_{\text{mech},k}$ and the mean dissipation $D_k$ are computed
when evaluating the material law. For the temperature update we use
Newton's method, i.e., we iterate

$$\theta^{i+1} = \theta^i - \frac{H_{\text{split}}(\theta^i)}{H'_{\text{split}}(\theta^i)} \quad\text{with}\quad \theta^0 = \theta_k, \tag{5.30}$$

and

$$H'_{\text{split}}(\theta) = \theta \left\langle \frac{\partial s_{\text{heat}}}{\partial \theta}(\cdot,\theta) \right\rangle_Y + \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y + \bar s_{\text{mech},k} - \bar s^n, \tag{5.31}$$

until the criterion

$$\left| \frac{H_{\text{split}}(\theta)}{\theta \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y + \theta\, \bar s_{\text{mech},k}} \right| \le \delta_{\text{heat}} \tag{5.32}$$

is met. Then, we set $\theta_{k+1} = \theta^i$. If the convergence criterion (5.29) is met,
we proceed to the next time step. Otherwise, we repeat updates (5.21)
and (5.27). The algorithm is summarized in Alg. 5.
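The scalar temperature update (5.27) with the phase average (5.28) and the Newton iteration (5.30) to (5.32) can be sketched as follows. The function name, the argument names and the representation of the phase entropies as Python callables are assumptions made for illustration. In contrast to Alg. 5, the sketch checks the residual before each update and writes the criterion (5.32) in multiplied-out form to avoid a division by zero.

```python
def solve_temperature(theta_k, s_mech_mean, dissipation, s_mean_old, dt_supply,
                      volume_fractions, s_heat_phases, ds_heat_phases,
                      tol=1e-4, max_iter=50):
    """Newton iteration for H_split(theta) = 0, cf. (5.27) and (5.30)-(5.32).

    s_heat_phases / ds_heat_phases: callables theta -> s_heat,j and its derivative
    dt_supply: product of time increment and heat supply, dt * S
    """
    theta = theta_k
    for _ in range(max_iter):
        # Phase-weighted heat-storage entropy and its derivative, cf. (5.28)
        s_heat = sum(c * s(theta) for c, s in zip(volume_fractions, s_heat_phases))
        ds_heat = sum(c * ds(theta) for c, ds in zip(volume_fractions, ds_heat_phases))
        h = theta * s_heat + theta * (s_mech_mean - s_mean_old) - dt_supply + dissipation
        # Relative criterion (5.32), multiplied out
        if abs(h) <= tol * abs(theta * (s_heat + s_mech_mean)):
            break
        dh = theta * ds_heat + s_heat + s_mech_mean - s_mean_old   # (5.31)
        theta -= h / dh                                            # (5.30)
    return theta
```

Because only the volume fractions and the phase-wise entropy functions enter, the cost of one Newton step is independent of the voxel count.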


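Several remarks are in order:

Algorithm 5 (…, $\mathrm{maxit}_{\text{heat}}$, $\delta_{\text{heat}}$)

 1: Set initial values for $\theta$ and $\varepsilon$
 2: $k \leftarrow 0$
 3: $r_{\text{mech}} \leftarrow 1$
 4: while $k < \mathrm{maxit}_{\text{mech}}$ and $r_{\text{mech}} > \delta_{\text{mech}}$ do
 5:     $k \leftarrow k + 1$
 6:     $\varepsilon \leftarrow F(\varepsilon,\theta)$                                        ▷ Isothermal step (5.21)
        $\bar s_{\text{mech}} \leftarrow \langle s_{\text{mech}}(\cdot,\varepsilon,\theta,z) \rangle_Y$
        $D \leftarrow \left\langle \frac{\partial \psi}{\partial z}(\cdot,\varepsilon,\theta,z) \cdot (z - z^n) \right\rangle_Y$
        $r_{\text{mech}} \leftarrow \|\Gamma : \sigma\|_{L^2} / \|\langle \sigma \rangle_Y\|_{L^2}$
 7:     $r_{\text{heat}} \leftarrow 1$
 8:     $i \leftarrow 0$
 9:     while $i < \mathrm{maxit}_{\text{heat}}$ and $r_{\text{heat}} > \delta_{\text{heat}}$ do      ▷ Temp. update (5.27)
10:         $i \leftarrow i + 1$
11:         $H \leftarrow \theta \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y + \theta (\bar s_{\text{mech}} - \bar s^n) - \Delta t\, S + D$
12:         $H' \leftarrow \theta \left\langle \frac{\partial s_{\text{heat}}}{\partial \theta}(\cdot,\theta) \right\rangle_Y + \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y + \bar s_{\text{mech}} - \bar s^n$
13:         $r_{\text{heat}} \leftarrow \left| H / \left( \theta \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y + \theta\, \bar s_{\text{mech}} \right) \right|$
14:         $\theta \leftarrow \theta - H / H'$
15:     end while
16: end while
17: return $\theta$, $\varepsilon$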
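The control flow of Alg. 5 can be mirrored in a few lines of Python. The sketch below is not the in-house implementation: the callable `mech_iteration`, which is assumed to return the updated strain together with the mean mechanical entropy, the mean dissipation and the mechanical residual, and the scalar toy model used to exercise the loop are both hypothetical.

```python
import math

def implicit_staggered(mech_iteration, s_heat_mean, ds_heat_mean, eps, theta,
                       s_mean_old, dt_supply, tol_mech=1e-5, tol_heat=1e-4,
                       maxit_mech=1000, maxit_heat=50):
    """Single mechanical iteration followed by a scalar Newton temperature update."""
    r_mech, k = 1.0, 0
    while k < maxit_mech and r_mech > tol_mech:
        k += 1
        # Isothermal step (5.21): one iteration of the mechanical solver
        eps, s_mech, dissipation, r_mech = mech_iteration(eps, theta)
        r_heat, i = 1.0, 0
        while i < maxit_heat and r_heat > tol_heat:
            i += 1
            s_heat = s_heat_mean(theta)
            h = theta * s_heat + theta * (s_mech - s_mean_old) - dt_supply + dissipation
            dh = theta * ds_heat_mean(theta) + s_heat + s_mech - s_mean_old
            # Assumes a nonzero entropy sum in the denominator
            r_heat = abs(h / (theta * s_heat + theta * s_mech))
            theta -= h / dh
    return theta, eps, k

# Toy model (illustrative values): relaxation towards a prescribed strain,
# mechanical entropy proportional to strain, no dissipation, no heat supply.
c0, theta_ref, kappa, eps_bar, theta_n = 2.0, 250.0, 0.5, 0.02, 300.0
s_old = c0 * math.log(theta_n / theta_ref)      # entropy at theta_n and zero strain

def toy_mech_iteration(eps, theta):
    eps_new = eps + 0.5 * (eps_bar - eps)
    return eps_new, kappa * eps_new, 0.0, abs(eps_bar - eps_new) / abs(eps_bar)

theta, eps, iterations = implicit_staggered(
    toy_mech_iteration, lambda t: c0 * math.log(t / theta_ref), lambda t: c0 / t,
    eps=0.0, theta=theta_n, s_mean_old=s_old, dt_supply=0.0)
```

For this adiabatic toy case the total entropy is conserved, so the temperature should approach $\theta^n \exp(-\kappa \bar\varepsilon / c_0)$, i.e., slight cooling under extension.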


1. For the temperature update, we use the entropy-based heat equation
(5.12) instead of the more common temperature-based formulation
(5.13). Consider the change in $s_{\text{heat}}$ under the assumption that the
heat capacity $c_\varepsilon$, as defined in (5.5), depends only on the temperature.
Using the implicit Euler time discretization on $\theta\, \dot s_{\text{heat}}$ in (5.12) yields

$$\theta\, \frac{s_{\text{heat}}(\theta) - s^n_{\text{heat}}}{\Delta t}. \tag{5.33}$$

If, alternatively, we discretize the corresponding term $c_\varepsilon(\theta)\, \dot\theta$ in (5.13),
we obtain

$$\theta\, \frac{\partial s_{\text{heat}}}{\partial \theta}(\theta)\, \frac{\theta - \theta^n}{\Delta t}. \tag{5.34}$$

Apparently, the change in entropy is basically linearized. Hence, to
obtain higher precision for large time increments, we prefer using
(5.12).

2. For the temperature update (5.27), we only consider the temperature
dependency of the entropy related to heat storage $s_{\text{heat}}$, whereas $\bar s_{\text{mech}}$
and $D$ remain fixed. Indeed, if the heat capacity $c_\varepsilon$ depends only
on the temperature, $s_{\text{mech}}$ is temperature independent and $D$ is at
most a linear function of the temperature. Thus, as the strain field
$\varepsilon$ converges, changes in subsequent iterates $\theta_k$ become small. As a
result, the solution of (5.27) approaches the solution of (5.20).

3. Due to the homogeneity of the macroscopic temperature $\theta$, computing
the mean entropy (5.28) is comparatively inexpensive. Thus, solving
the scalar equation (5.27) introduces no significant computational
overhead. Compared to solving the isothermal mechanical problem,
higher computation times may still arise for the thermomechanically
coupled case, due to two factors. First, the iterative solver (5.21) may
require more iterations to converge, depending on the thermomechanical
coupling of the composite, i.e., the temperature dependence of the
material laws and the magnitude of $\bar s_{\text{mech}}$ and $D$. Secondly, evaluating
$\bar s_{\text{mech}}$ and $D$ may affect the overall runtime of the algorithm, provided
the associated computational effort is similar to the evaluation of the
material law and the $\Gamma$ operator.

4. Armero and Simo (1992; 1993) analyzed different operator splits for
thermomechanically coupled problems and found that the explicit
isothermal split, i.e., an isothermal mechanical step followed by
a temperature update, is only conditionally stable. As an alternative,
they proposed the unconditionally stable adiabatic split, where
the mechanical problem is solved under the condition $\dot s = 0$. For
the present algorithm, we rely on the isothermal split as it is more
convenient from the viewpoint of implementation. Suppose we
already have an existing code for a purely mechanics-based solution
scheme. For the isothermal split, only an update of the temperature
dependent material parameters and the computation of $\bar s_{\text{mech}}$ and
$D$ have to be added to the already implemented material law. For
the adiabatic split, on the other hand, simple reduced forms of the
material law, which identically fulfill $\dot s = 0$, can only be derived
in special cases such as linear thermoelasticity (Armero and Simo,
1992). For more complex material laws with arbitrary temperature
dependencies, the implementation of an additional adiabatic formulation
may be cumbersome or even require an iterative local solution
scheme. Thus, for tackling the issue of instability, we prefer using
an implicit staggered approach based on an isothermal split (Erbts
and Düster, 2012). Indeed, we encountered no numerical instabilities
in our numerical experiments in Sec. 5.4, even for a composite with
strong thermomechanical coupling.
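Returning to the first remark, the effect of linearizing the entropy change can be made tangible with a small numerical comparison. The sketch assumes a heat capacity that is linear in the temperature, $c_\varepsilon(\theta) = c_0 [1 + k(\theta - \theta_{\text{ref}})]$, for which $s_{\text{heat}}(\theta) = c_0 [(1 - k\theta_{\text{ref}}) \ln(\theta/\theta_{\text{ref}}) + k(\theta - \theta_{\text{ref}})]$ follows by integrating $c_\varepsilon/\theta$. The parameter values are illustrative and not material data from this work.

```python
import math

def s_heat(theta, c0, k, theta_ref):
    """Heat-storage entropy for a heat capacity linear in the temperature."""
    return c0 * ((1.0 - k * theta_ref) * math.log(theta / theta_ref)
                 + k * (theta - theta_ref))

def heat_capacity(theta, c0, k, theta_ref):
    return c0 * (1.0 + k * (theta - theta_ref))

def entropy_increment(theta, theta_n, c0, k, theta_ref):
    # theta * (s_heat(theta) - s_heat(theta_n)), i.e., (5.33) multiplied by dt
    return theta * (s_heat(theta, c0, k, theta_ref) - s_heat(theta_n, c0, k, theta_ref))

def linearized_increment(theta, theta_n, c0, k, theta_ref):
    # c_eps(theta) * (theta - theta_n), i.e., (5.34) multiplied by dt
    return heat_capacity(theta, c0, k, theta_ref) * (theta - theta_n)

def relative_gap(d_theta, theta_n=293.15, c0=1.0, k=0.003, theta_ref=293.15):
    exact = entropy_increment(theta_n + d_theta, theta_n, c0, k, theta_ref)
    linear = linearized_increment(theta_n + d_theta, theta_n, c0, k, theta_ref)
    return abs(exact - linear) / abs(exact)
```

Calling `relative_gap` with a larger temperature increment yields a larger relative deviation between the two discretizations, in line with the preference for (5.12) at large time increments.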


5.4 Numerical demonstrations 


5.4.1 Setup 


Alg. 5 for thermomechanically coupled problems was implemented in
an in-house FFT-based computational homogenization code written in
Python 3.7 with FFTW (Frigo and Johnson, 2005) bindings. Applying $\Gamma$
and evaluating the material law were integrated as Cython extensions
(Behnel et al., 2011) and parallelized using OpenMP. Throughout, we
rely on the discretization by trigonometric polynomials introduced by
Moulinec and Suquet (1998). As convergence criterion for the iterative
FFT-based solver, we use (5.29)

$$\frac{\|\Gamma : \sigma\|_{L^2}}{\|\langle \sigma \rangle_Y\|_{L^2}} \le \delta_{\text{mech}} \tag{5.35}$$

with a prescribed tolerance of $\delta_{\text{mech}} = 10^{-5}$. The tolerance for the
convergence criterion (5.32) of the temperature update

$$\left| \frac{H_{\text{split}}(\theta)}{\theta \langle s_{\text{heat}}(\cdot,\theta) \rangle_Y + \theta\, \bar s_{\text{mech},k}} \right| \le \delta_{\text{heat}} \tag{5.36}$$

is set to $\delta_{\text{heat}} = 10^{-4}$. For the computations on the 2-dimensional
microstructure in Sec. 5.4.2, a desktop computer with 32 GB RAM
and a 6-core Intel i7-8700K CPU was used. The computations on the
3-dimensional microstructure in Sec. 5.4.3 were performed on a workstation
with 512 GB RAM and two 12-core Intel Xeon(R) Gold 6146.
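How the residual in (5.35) might be evaluated is sketched below with plain NumPy FFTs instead of the FFTW/Cython routines mentioned above. The sketch assumes a two-dimensional square cell of unit edge length, the operator $\Gamma = \nabla^s (\operatorname{div} \nabla^s)^{-1} \operatorname{div}$ as stated in Sec. 5.3 and a nonzero mean stress. In Fourier space and for a unit frequency direction $n$, this operator acts as $\hat\Gamma : \hat\tau = n \otimes (\hat\tau n) + (\hat\tau n) \otimes n - (n \cdot \hat\tau n)\, n \otimes n$, which follows from inverting the symbol of $\operatorname{div} \nabla^s$. The function name and array layout are hypothetical.

```python
import numpy as np

def equilibrium_residual(sigma):
    """Relative residual ||Gamma : sigma|| / ||<sigma>|| for a stress field of
    shape (2, 2, n1, n2) that is symmetric in its first two indices."""
    n1, n2 = sigma.shape[2:]
    # Integer frequencies; only their direction enters Gamma
    xi = np.stack(np.meshgrid(np.fft.fftfreq(n1, d=1.0 / n1),
                              np.fft.fftfreq(n2, d=1.0 / n2), indexing="ij"))
    length = np.sqrt(np.sum(xi**2, axis=0))
    length[0, 0] = 1.0                       # zero frequency: direction set to zero
    n = xi / length
    sig_hat = np.fft.fft2(sigma, axes=(2, 3))
    t = np.einsum("ij...,j...->i...", sig_hat, n)      # sigma_hat n
    tn = np.einsum("i...,i...->...", t, n)             # n . sigma_hat n
    nt = np.einsum("i...,j...->ij...", n, t)           # n ⊗ (sigma_hat n)
    nn = np.einsum("i...,j...->ij...", n, n)
    gamma_sig = nt + np.swapaxes(nt, 0, 1) - tn * nn
    # Parseval: volume average of |f|^2 equals sum |f_hat|^2 / (n1 * n2)^2
    norm_gamma = np.sqrt(np.sum(np.abs(gamma_sig) ** 2)) / (n1 * n2)
    norm_mean = np.linalg.norm(sigma.mean(axis=(2, 3)))
    return norm_gamma / norm_mean
```

A divergence-free stress field, e.g., a constant one, yields a vanishing residual, whereas a normal stress that varies along its own direction does not.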


5.4.2 Continuous glass-fiber reinforced polypropylene 


Figure 5.1: Continuous glass-fiber reinforced polypropylene: Microstructure and schematic
of the generalized Maxwell model for polypropylene

In the following example, we consider a composite consisting of a
polypropylene matrix unidirectionally reinforced by continuous glass 
fibers with a volume fraction of 30%. The microstructure, see Fig. 5.1, is 
modeled as a two-dimensional periodic cell with a resolution of 512, 
containing 200 fibers. It was generated using the adaptive shrinking cell 
algorithm of Torquato and Jiao (2010). 


The glass fibers are modeled as an isotropic linear thermoelastic material.
The free energy related to heat storage reads

$$\psi_{\text{heat}}(\theta) = c_0 \left[ (\theta - \theta_{\text{ref}}) - \theta \ln\left( \frac{\theta}{\theta_{\text{ref}}} \right) \right], \tag{5.37}$$

and corresponds to a material with a constant heat capacity $c_\varepsilon(\theta) = c_0$.
Typically, for solids, states of constant strain are hard to realize under
fluctuating temperatures. Hence, the heat capacity $c_\varepsilon$ at constant strain
is usually not measured experimentally. However, its value is typically
close to the heat capacity $c_\sigma$ at constant stress. The mechanical part of
the free energy is given by

$$\psi_{\text{mech}}(\varepsilon,\theta) = \frac{1}{2}\, \varepsilon : \mathbb{C} : \varepsilon - \varepsilon : \mathbb{C} : \left( \alpha (\theta - \theta_{\text{ref}}) \right), \tag{5.38}$$

implying the stress-strain relation

$$\sigma = \mathbb{C} : \left( \varepsilon - \alpha (\theta - \theta_{\text{ref}}) \right) \tag{5.39}$$

with a stiffness tensor $\mathbb{C}$ and a thermal expansion tensor $\alpha \in \mathrm{Sym}(d)$.
The associated entropies read

$$s_{\text{heat}}(\theta) = c_0 \ln\left( \frac{\theta}{\theta_{\text{ref}}} \right) \quad\text{and}\quad s_{\text{mech}}(\varepsilon) = \varepsilon : \mathbb{C} : \alpha. \tag{5.40}$$

As the material is elastic, no energy is dissipated, i.e., $D = 0$, and the
thermomechanical coupling is governed solely by $s_{\text{mech}}$. Changes in
$s_{\text{mech}}$ cause self-heating under hydrostatic compression and self-cooling
under hydrostatic extension. This phenomenon is commonly referred
to as thermoelastic coupling effect, see Sec. 13.2 in Haupt (2002), or
Gough-Joule effect, see Sec. 96 in Truesdell and Noll (2004). For the glass
fibers, we assume that both stiffness tensor and thermal expansion are
isotropic, i.e.,

$$\mathbb{C} = 3K\, \mathbb{P}_1 + 2G\, \mathbb{P}_2 \quad\text{and}\quad \alpha = \alpha_0 I, \tag{5.41}$$

with bulk modulus $K$, shear modulus $G$ and isotropic coefficient of
thermal expansion $\alpha_0$. By $\mathbb{P}_1$ and $\mathbb{P}_2$ we denote the projectors onto the
spherical and deviatoric $d \times d$ matrices, respectively. The parameters
of the model are taken from Tikkarrouchine et al. (2019) and listed in
Tab. 5.1.


Table 5.1: Material parameters of the glass fibers (Tikkarrouchine et al., 2019)


Thermal expansion   $\alpha_0 = 9 \times 10^{-6}$ 1/K
Heat capacity       $c_0 = 2.1 \times 10^{6}$ J/(m³ K)
Bulk modulus        $K = 50$ GPa
Shear modulus       $G = 28.6$ GPa


For the polypropylene matrix, we assume a linear thermoviscoelastic 
model based on a generalized Maxwell model, see Fig. 5.1 and Sec. 
3.5.1 in Tschoegl (1989). For models accounting for effects outside 
of the viscoelastic domain, we refer, e.g., to Krairi et al. (2019) and 
Benaarbia et al. (2019) for extensions to viscoplasticity and damage, and 
Tscharnuter et al. (2012) for a study on polypropylene. Based on the 
caloric data in Table 18.10 in the Springer Handbook of Materials Data 
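(Warlimont and Martienssen, 2018), we assume a heat-storage related
free energy of the form


$$\psi_{\mathrm{heat}}(\theta) = c_0 \left[ (1 - k\,\theta_{\mathrm{ref}}) \left( (\theta - \theta_{\mathrm{ref}}) - \theta \ln\left(\frac{\theta}{\theta_{\mathrm{ref}}}\right) \right) - \frac{k}{2} (\theta - \theta_{\mathrm{ref}})^2 \right], \qquad (5.42)$$


corresponding to a linear heat capacity

$$c_\varepsilon(\theta) = c_0 \left(1 + k (\theta - \theta_{\mathrm{ref}})\right). \qquad (5.43)$$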


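The energy stored in the generalized Maxwell model with $N_{\mathrm{MW}}$ Maxwell
elements reads


$$w_{\mathrm{mech}}(\varepsilon, \theta, \varepsilon_{v,a}) = \frac{1}{2}\, \varepsilon : \mathbb{C}_0 : \varepsilon + \frac{1}{2} \sum_{a=1}^{N_{\mathrm{MW}}} (\varepsilon - \varepsilon_{v,a}) : \mathbb{C}_a : (\varepsilon - \varepsilon_{v,a}) - \varepsilon : \mathbb{C}_0 : \left(\alpha (\theta - \theta_{\mathrm{ref}})\right) - \sum_{a=1}^{N_{\mathrm{MW}}} (\varepsilon - \varepsilon_{v,a}) : \mathbb{C}_a : \left(\alpha (\theta - \theta_{\mathrm{ref}})\right). \qquad (5.44)$$

Consequently, the stress computes as

$$\sigma = \mathbb{C}_0 : \left(\varepsilon - \alpha (\theta - \theta_{\mathrm{ref}})\right) + \sum_{a=1}^{N_{\mathrm{MW}}} \mathbb{C}_a : \left(\varepsilon - \varepsilon_{v,a} - \alpha (\theta - \theta_{\mathrm{ref}})\right). \qquad (5.45)$$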


We assume that the viscosity tensor associated with a dashpot of the
generalized Maxwell model has the form


$$\mathbb{V}_a = a(\theta)\, \tau_a\, \mathbb{C}_a, \qquad (5.46)$$


where $a : \mathbb{R}_{>0} \to \mathbb{R}$ denotes a temperature-dependent shift function. The
corresponding fluidity $\mathbb{F}_a$ is defined by the pseudoinverse


$$\mathbb{F}_a = \mathbb{V}_a^{\dagger}. \qquad (5.47)$$

In terms of the partial stresses

$$\sigma_{v,a} = \mathbb{C}_a : \left(\varepsilon - \varepsilon_{v,a} - \alpha (\theta - \theta_{\mathrm{ref}})\right), \qquad (5.48)$$

the evolution equation for the viscous strains reads

$$\dot{\varepsilon}_{v,a} = \mathbb{F}_a : \sigma_{v,a}. \qquad (5.49)$$


For simplicity, we assume that polypropylene is isotropic and linear 
elastic in dilation, see Sec. 9.4 in Brinson and Brinson (2015). More 
precisely, the stiffness tensors and the thermal expansion have the form 


$$\mathbb{C}_0 = 3K_0\, \mathbb{P}_1 + 2G_0\, \mathbb{P}_2, \quad \mathbb{C}_a = 2G_a\, \mathbb{P}_2 \quad \text{and} \quad \alpha = \alpha_0\, I. \qquad (5.50)$$


In this particular case, the viscous strains $\varepsilon_{v,a}$ are purely deviatoric and
independent of thermal expansion. The shift function $a$ describes the
time-temperature dependency of the material. At room temperature
$\theta_{\mathrm{ref}} = 293.15$ K, polypropylene is above its glass transition temperature
$\theta_{\mathrm{glass}} \approx 273.15$ K. Hence, we use the Williams-Landel-Ferry (WLF)
equation (Williams et al., 1955) as ansatz for the shift function


$$\log_{10} a(\theta) = -\frac{C_1 (\theta - \theta_{\mathrm{ref}})}{C_2 + \theta - \theta_{\mathrm{ref}}}. \qquad (5.51)$$

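
To illustrate how sensitive the shift function is near room temperature, the following minimal sketch evaluates the WLF equation (5.51) with the constants $C_1 = 45$ and $C_2 = 158$ K listed in Tab. 5.2. The function name is illustrative.

```python
C1 = 45.0            # WLF constant C_1 (dimensionless), Tab. 5.2
C2 = 158.0           # WLF constant C_2 in K, Tab. 5.2
THETA_REF = 293.15   # reference temperature in K


def wlf_shift(theta):
    """Shift factor a(theta) from the WLF equation (5.51)."""
    d_theta = theta - THETA_REF
    return 10.0 ** (-C1 * d_theta / (C2 + d_theta))


shift_ref = wlf_shift(THETA_REF)             # equals 1 at the reference temperature
shift_cooled = wlf_shift(THETA_REF - 1.0)    # > 1: relaxation slows down
shift_heated = wlf_shift(THETA_REF + 1.0)    # < 1: relaxation speeds up
```

With these constants, a temperature change of 1 K around $\theta_{\mathrm{ref}}$ rescales all relaxation times by roughly a factor of two.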
For the present study, we restrict to linear viscoelastic behavior and focus 
on the effects induced by the thermomechanical coupling. In particular, 
we omit a possible pressure dependence of the shift factor as suggested 
by Fillers and Tschoegl (1977) based on free-volume considerations. 
For our implementation, we use the time-integration scheme of Taylor
et al. (1970), which is based on the partial stresses $\sigma_{v,a}$ instead of $\varepsilon_{v,a}$. The
update reads


: (e-—e” +a(0 — 0")) 
(5.52) 


where $(\cdot)^n$ denotes the value of the last converged time step and $\xi$ is a
reduced time defined via


$$\xi(t) = \int_0^t \frac{1}{a(\theta(s))}\, \mathrm{d}s. \qquad (5.53)$$


We compute the change in reduced time $\Delta\xi = \xi - \xi^n$ by a 5-point Gauss
quadrature, assuming a constant temperature rate. The entropies and
dissipation in terms of the partial stresses $\sigma_{v,a}$ read


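$$s_{\mathrm{heat}}(\theta) = c_0 \left[ (1 - k\,\theta_{\mathrm{ref}}) \ln\left(\frac{\theta}{\theta_{\mathrm{ref}}}\right) + k (\theta - \theta_{\mathrm{ref}}) \right], \qquad (5.54)$$

$$s_{\mathrm{mech}}(\varepsilon, \theta, \sigma_{v,a}) = \varepsilon : \mathbb{C}_0 : \alpha + \sum_{a=1}^{N_{\mathrm{MW}}} \alpha : \left[ \sigma_{v,a} + \mathbb{C}_a : \left(\alpha (\theta - \theta_{\mathrm{ref}})\right) \right], \qquad (5.55)$$

$$D = \sum_{a=1}^{N_{\mathrm{MW}}} \sigma_{v,a} : \mathbb{F}_a : \sigma_{v,a}. \qquad (5.56)$$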
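
The reduced-time increment described above can be sketched as follows. This is a minimal illustration, assuming that the reduced time is the time integral of $1/a(\theta)$ and that the temperature varies linearly within a time step (constant temperature rate). The WLF constants are those of Tab. 5.2, and the function names are illustrative.

```python
import numpy as np

C1, C2, THETA_REF = 45.0, 158.0, 293.15   # WLF constants and reference temperature


def wlf_shift(theta):
    """Shift factor a(theta) from the WLF equation (5.51)."""
    d_theta = theta - THETA_REF
    return 10.0 ** (-C1 * d_theta / (C2 + d_theta))


# 5-point Gauss-Legendre rule on the interval [-1, 1]
NODES, WEIGHTS = np.polynomial.legendre.leggauss(5)


def reduced_time_increment(theta_old, theta_new, dt):
    """Approximate the integral of 1/a(theta(s)) over one time step of length dt.

    Assumes a linear temperature path from theta_old to theta_new.
    """
    fractions = 0.5 * (NODES + 1.0)                      # map nodes to [0, 1]
    thetas = theta_old + fractions * (theta_new - theta_old)
    return 0.5 * dt * np.sum(WEIGHTS / wlf_shift(thetas))
```

At a constant temperature $\theta_{\mathrm{ref}}$ the increment equals the physical time step. When the material cools during the step, the increment is smaller, i.e., relaxation is delayed.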


The material parameters used are listed in Tab. 5.2. The caloric
parameters were chosen based on Tables 18.9 and 18.10 in the Springer
Handbook of Materials Data (Warlimont and Martienssen, 2018) and the
viscoelastic parameters are taken from the experimental study by Kehrer
et al. (2018). Note that Kehrer et al. (2018) characterized the behavior of
polypropylene over a wide range of frequencies and temperatures, using
27 Maxwell elements for their model. For the present study, we restrict
to moderate temperature and frequency changes and only consider
9 elements with time constants $\tau_a \in [10^{-4}, 10^{4}]$ s in order to reduce
computation times. The shear moduli of the elements with $\tau_a > 10^{4}$ s
are added to the elastic shear modulus $G_0$, whereas the elements with
$\tau_a < 10^{-4}$ s were omitted.


Table 5.2: Material parameters of polypropylene (Kehrer et al., 2018; Warlimont and
Martienssen, 2018)


Thermal expansion    $\alpha_0 = 1.91 \times 10^{-4}$ 1/K
Heat capacity        $c_0 = 1.512 \times 10^{6}$ J/(m$^3$ K)
                     $k = 4.6 \times 10^{-3}$ 1/K
WLF constants        $C_1 = 45$
                     $C_2 = 158$ K
Bulk modulus         $K_0 = 4930$ MPa
Shear modulus        $G_0 = 415.6$ MPa
Maxwell elements     $\tau_1 = 10^{-4}$ s    $G_1 = 154.8$ MPa
                     $\tau_2 = 10^{-3}$ s    $G_2 = 127.0$ MPa
                     $\tau_3 = 10^{-2}$ s    $G_3 = 97.6$ MPa
                     $\tau_4 = 10^{-1}$ s    $G_4 = 72.3$ MPa
                     $\tau_5 = 1$ s          $G_5 = 50.9$ MPa
                     $\tau_6 = 10$ s         $G_6 = 38.4$ MPa
                     $\tau_7 = 10^{2}$ s     $G_7 = 36.7$ MPa
                     $\tau_8 = 10^{3}$ s     $G_8 = 31.5$ MPa
                     $\tau_9 = 10^{4}$ s     $G_9 = 30.6$ MPa


Uniaxial extension. In our first set of experiments, we take a look at
the stress-strain behavior under uniaxial extension and compression.
We want to assess the strength of the thermomechanical coupling for
the investigated composite microstructure. Furthermore, we are
interested in the performance of different FFT-based solution algorithms in
conjunction with the staggered thermomechanical solution scheme in
Alg. 5. To this end, we choose the Barzilai-Borwein method (Schneider,
2019a) and the Newton-CG method (Gélébart and Mondon-Cancel,
2013; Kabel et al., 2014) as the fastest strain-based solvers, see Ch. 3. In
addition, the basic scheme by Moulinec and Suquet (1998) is included
as a classical benchmark. For the loading, we apply mixed boundary
conditions, see Kabel et al. (2016), corresponding to strain-controlled
uniaxial extension/compression to 5% with a strain rate of 1/s at various
loading angles in the xz-plane with respect to the x-direction, i.e., the
fiber direction. For the first set of computations, we consider adiabatic
conditions, i.e., S = 0, where self-heating/-cooling of the material is
expected. The second set of computations is performed with a fixed
temperature of $\theta_{\mathrm{ref}} = 293.15$ K as reference.

Figure 5.2: Continuous glass-fiber reinforced polypropylene: Stress vs strain at various
loading angles in the xz-plane with respect to the x-direction (curves: adiabatic
compression, adiabatic tension, isothermal tension)


The resulting stress-strain curves are plotted in Fig. 5.2. In the isothermal
setting, there is no distinction between tension and compression and we
observe a linear relation between stresses and strains. For the loading
under adiabatic conditions, however, the thermomechanical coupling
induces an effectively nonlinear behavior. To be more precise, under
compression, $s_{\mathrm{mech}}$ decreases, which leads to a rise in temperature,
resulting in the softening of the polypropylene matrix. Conversely, under
tension, $s_{\mathrm{mech}}$ increases, leading to a lower temperature and the stiffening
of polypropylene. Two factors contribute to the strength of the observed
thermomechanical coupling. First, due to its high thermal expansion
coefficient $\alpha_0$, the Gough-Joule effect, i.e., the strain-induced change of
$s_{\mathrm{mech}}$, is rather pronounced for polypropylene. Secondly, the mechanical
behavior of polypropylene is very sensitive to temperature changes in
the vicinity of its glass transition temperature, as encapsulated by the
WLF equation (5.51). Note that the computations for the 0° load angle
represent an exception to these observations. In this case, the fibers
carry most of the load and we observe no difference between isothermal
and adiabatic computations, due to the temperature independence of their
stiffness.

Performance comparison for a single load step. Next, we take a closer
look at the performance of the different FFT-based solution schemes. In
particular, we are interested in how their convergence behavior changes in
case of strong thermomechanical coupling, compared to the isothermal
setting. Hence, we consider the load case of uniaxial extension at a
90° load angle, where the coupling is most pronounced. First, the
performance is evaluated for a single load step up to 5% strain.

Figure 5.3: Continuous glass-fiber reinforced polypropylene: Performance comparison for
5% uniaxial extension in z-direction in a single load step. (a) Residual vs iteration
count, (b) residual vs computation time


The residual is plotted as a function of iteration counts and computation 
time in Fig. 5.3. Note that the convergence behavior of the Newton-CG 
method in the adiabatic setting is distinctly different in comparison to 
the isothermal computation. For the isothermal case, the decrease of 
the residual gradually grows in subsequent Newton iterations. Due 
to the adaptive forcing-term choice of Eisenstat and Walker (1996), 
the linear system is thus solved to higher accuracy. In contrast, the 
convergence rate with respect to Newton iterations is roughly constant 
for the adiabatic computation. This is due to the fact that we do not 
consider the temperature dependence of the material behavior in the 
computation of the Hessian. Thus, the linear approximation of the 
gradient is less precise than for the isothermal computation. With respect 
to the overall performance, this effect is somewhat alleviated by the 
forcing-term of Eisenstat and Walker (1996), as the linear system is solved 
to lower accuracy, thereby reducing the cost of each Newton iteration. 
Even though Newton-CG requires 75% more Newton iterations in the
adiabatic setting, the runtime only increases by about 30%, see Tab. 5.3.
For the basic scheme, the convergence rate in the adiabatic and isothermal
setting is nearly identical. The same is true for the Barzilai-Borwein
method, which displays its characteristic non-monotone behavior and 
converges in much fewer iterations than the basic scheme, see Tab. 5.3. 
Even though the iteration counts of both schemes are roughly identical 
for both settings, the overall computation times are slightly higher for 
the adiabatic computations. 

A look at the computational cost of the most expensive operations, i.e.,
the material law, the FFTs and the $\Gamma^0$-operator, clarifies this phenomenon.
In Tab. 5.4, the average computation times per application of these
operations are listed for the 90° load case solved by the Barzilai-Borwein
scheme. Notably, for the adiabatic setting, the additional computation
of $D$ and $s_{\mathrm{mech}}$ in the material law increases its time per application by
about 70%. Thus, the overall cost per iteration ends up 30% higher. The
same is true for the basic scheme.


Table 5.3: Continuous glass-fiber reinforced polypropylene: Iteration counts and
computation times for 5% uniaxial extension in z-direction in a single load step


                                        isothermal   adiabatic
Basic scheme        Iter.               349          323
                    Comp. time in s     9.34         10.04
Barzilai-Borwein    Iter.               60           59
                    Comp. time in s     1.60         2.01
Newton-CG           Newton iter.        8            14
                    CG iter.            55           64
                    Comp. time in s     1.80         2.38


Table 5.4: Continuous glass-fiber reinforced polypropylene: Computation time per
application of the most expensive operations for loading in z-direction and solved by the
Barzilai-Borwein method in a single load step


Mean comp. time per application in ms    isothermal   adiabatic
Material law                             9.6          16.2
FFT                                      9.7          9.6
$\Gamma^0$ operator                      2.4          2.5
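
The percentages quoted for the Barzilai-Borwein method can be checked with a few lines of arithmetic on the values of Tabs. 5.3 and 5.4. The sketch assumes that the total runtime is spread evenly over the iterations, which neglects any setup cost.

```python
# Barzilai-Borwein method, single load step (Tab. 5.3)
iters = {"isothermal": 60, "adiabatic": 59}
time_s = {"isothermal": 1.60, "adiabatic": 2.01}

# Mean time per application of the material law in ms (Tab. 5.4)
law_ms = {"isothermal": 9.6, "adiabatic": 16.2}

# Average cost of one iteration in ms, assuming negligible setup cost
ms_per_iter = {key: 1e3 * time_s[key] / iters[key] for key in iters}

# Relative increases from the isothermal to the adiabatic setting
law_increase = law_ms["adiabatic"] / law_ms["isothermal"] - 1.0
iter_increase = ms_per_iter["adiabatic"] / ms_per_iter["isothermal"] - 1.0
```

The material law becomes about 69% more expensive, while the cost per iteration grows by about 28%, consistent with the rounded figures of 70% and 30% stated in the text.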


Comparing the overall performance of the schemes, we observe that
the Barzilai-Borwein method is the fastest for both the isothermal and
adiabatic setting. The Newton-CG method is only slightly slower but
suffers from increased iteration counts for the adiabatic case. The basic
scheme is by far the slowest, taking 5-6 times longer than the
Barzilai-Borwein method.


Performance comparison for 20 load steps. Next, we investigate the
performance of the solvers when subdividing the strain loading of
5% into 20 equally spaced load steps. An affine-linear extrapolation
(Moulinec and Suquet, 1998) is applied at the beginning of each load
step to obtain an initial guess for the strain field. The total iteration
counts and computation times of the different solvers in each load step
are plotted in Fig. 5.4.

Figure 5.4: Continuous glass-fiber reinforced polypropylene: Performance comparison for
5% uniaxial extension in z-direction in 20 load steps
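
As an illustration of the initial guess, the following minimal sketch implements one common form of affine-linear extrapolation from the two previous converged strain fields. The exact variant used in the computations may differ, and the function and argument names are illustrative.

```python
import numpy as np


def extrapolate_strain(eps_prev, eps_curr, t_prev, t_curr, t_next):
    """Affine-linear extrapolation of the strain field to the next load step.

    eps_prev and eps_curr are the converged strain fields at times
    t_prev and t_curr, stored as arrays of identical shape.
    """
    ratio = (t_next - t_curr) / (t_curr - t_prev)
    return eps_curr + ratio * (eps_curr - eps_prev)
```

For a response that is linear in the load, as in the isothermal computations, this guess is already exact, which explains the low iteration counts. Any nonlinearity in the effective response reduces its quality.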

For all solvers, the iteration counts decrease up to step 5 as the
affine-linear extrapolation takes effect. For the isothermal computations, the
iteration counts further decrease after this point, due to the linear
stress-strain behavior. In contrast, the iteration counts stagnate or, in case of
the basic scheme, even increase for the adiabatic computations. This
coincides with the onset of the effectively nonlinear material behavior
for uniaxial strains larger than 1%, see Fig. 5.2d. Hence, the affine-linear
extrapolation becomes less effective, which leads to higher iteration
counts compared to the isothermal computations. Note that for a material
which already behaves nonlinearly under isothermal conditions, this
difference between adiabatic and isothermal computations is expected
to be less pronounced.


Table 5.5: Continuous glass-fiber reinforced polypropylene: Mean iteration counts and
computation times for 5% uniaxial extension in z-direction and 20 load steps


                                             isothermal   adiabatic
Basic scheme        Mean iter.               87.80        87.05
                    Mean comp. time in s     2.23         2.87
Barzilai-Borwein    Mean iter.               17.45        22.00
                    Mean comp. time in s     0.58         0.79
Newton-CG           Mean Newton iter.        5.15         7.70
                    Mean CG iter.            15.40        23.15
                    Mean comp. time in s     0.61         0.97


As for the loading in a single step, the Barzilai-Borwein method is fastest.
Its computation time for the adiabatic case increases by roughly 35%,
due to higher iteration counts and the additional cost per iteration.
The performance of the Newton-CG method is nearly identical to the
Barzilai-Borwein method for the isothermal computation. However, it
exhibits a larger decrease in performance for the adiabatic computation,
with an increase in computation time by nearly 60%. For the basic
scheme, the iteration counts are roughly identical for the isothermal and
the adiabatic setting. In the first 9 steps, the adiabatic computation
converges faster, as a consequence of the stiffening due to self-cooling
and the resulting reduction in material contrast. For the subsequent
steps, the isothermal computation requires fewer iterations, due to the
more effective affine-linear extrapolation. Fortuitously, these effects
roughly cancel each other out. Overall, the basic scheme is still the
slowest, taking 3-4 times longer than the Barzilai-Borwein method to
converge.


To summarize, we observe that the convergence behavior of the basic 
scheme and the Barzilai-Borwein method in conjunction with Alg. 5 
is similar to their convergence behavior under isothermal conditions, 
even for a composite with strong thermomechanical coupling. The 
computation times for the thermomechanically coupled computations 
increase by roughly 30% for both schemes, which is mainly due to 
the additional cost of computing the dissipation D and mechanical 
entropy Smech in the material law. The Newton-CG method suffered 
the highest decrease in performance for the coupled computations, as 
the temperature dependence is neglected in the Hessian computation. 
This leads to a significant increase in Newton- and CG-iterations, in 
addition to the higher cost per Newton iteration. 

Considering the overall performance, the Barzilai-Borwein method 
and the Newton-CG method are the fastest solvers. Due to its lower 
memory requirements and its more robust convergence behavior in the 
thermomechanically coupled computations, we use the Barzilai-Borwein 
method for all following computations. 


5.4.3 Planar short glass-fiber reinforced polypropylene 


Motivated by the numerical experiments in the last section, we
investigate a more complex microstructure, see Fig. 5.5. We consider
a polypropylene matrix reinforced by 1130 short glass-fibers with an
aspect ratio of 20. The fiber volume fraction amounts to 13.2%,
corresponding to a mass fraction of 30%. The microstructure was generated by
the sequential addition and migration algorithm (Schneider, 2017b) and
discretized by 512 x 512 x 64 voxels. The second-order fiber-orientation
tensor reads A = diag(0.45, 0.45, 0.1), see Advani and Tucker (1987).

Figure 5.5: Short glass-fiber reinforced polypropylene: Microstructure and von Mises
equivalent strain after 1% uniaxial extension in x-direction


For the following investigations, we use the same material models and
parameters as in Sec. 5.4.2.

Dynamic mechanical analysis. The macroscopic behavior of viscoelastic
composites is often investigated under steady-state oscillations with
a fixed frequency $f \in \mathbb{R}_{>0}$, see Sec. 5.5 in Brinson and Brinson (2015).
This is commonly called dynamic-mechanical analysis (DMA). Suppose
a linear viscoelastic material is harmonically excited by uniaxial
tension/compression where the strain component in loading direction is
given by

$$\varepsilon(t) = \varepsilon_{\mathrm{amp}} \sin(\omega t), \qquad (5.57)$$


with the strain amplitude $\varepsilon_{\mathrm{amp}} \in \mathbb{R}_{>0}$ and the angular frequency
$\omega = 2\pi f$. The stress response of the material in loading direction reads


$$\sigma(t) = \sigma_{\mathrm{amp}} \sin(\omega t + \delta) \qquad (5.58)$$

with the stress amplitude $\sigma_{\mathrm{amp}} \in \mathbb{R}_{>0}$ and phase difference $\delta \in [0, \pi/2]$.
Typical characteristics for the material are the storage modulus


$$E' = \frac{\sigma_{\mathrm{amp}}}{\varepsilon_{\mathrm{amp}}} \cos(\delta) \qquad (5.59)$$

and loss modulus

$$E'' = \frac{\sigma_{\mathrm{amp}}}{\varepsilon_{\mathrm{amp}}} \sin(\delta). \qquad (5.60)$$


The storage modulus is related to the average elastic energy stored in a
load cycle


$$W_{\mathrm{cycle}} = \frac{1}{4}\, \varepsilon_{\mathrm{amp}}^2\, E' \qquad (5.61)$$


and serves as a measure of the material's elastic stiffness. The loss
modulus is proportional to the energy dissipated over a load cycle


$$D_{\mathrm{cycle}} = \pi\, \varepsilon_{\mathrm{amp}}^2\, E'', \qquad (5.62)$$


see Sec. 9.1 in Tschoegl (1989). Thus, E” is of particular interest in cases 
of harmonic loadings with high cycle counts. For instance in fatigue 
experiments (Handa et al., 1999; Esmaeillou et al., 2012), the dissipated 
energy accumulates, leading to an increase of temperature over time. 
For linear viscoelastic material models, such as the generalized Maxwell 
model for polypropylene, E’ and E” can be computed analytically in the 
isothermal setting, see Sec. 11.1 in Tschoegl (1989). However, as we have 
seen in Sec. 5.4.2, the thermomechanical coupling induces a nonlinear 
behavior due to self-heating and self-cooling. Thus, we characterize the 
viscoelastic behavior of the composite by simulating DMA tests. More 
precisely, we run through the following steps: 
1. In analogy to tensile DMA experiments, a static uniaxial tensile load
of $\varepsilon_{\mathrm{static}}$ is applied in a single step in 1 second.

2. The loading $\varepsilon_{\mathrm{static}}$ is held constant for 100 seconds. In actual
experiments, the holding time is usually much shorter. However, we want
to mitigate the effects of the initial stress relaxation on our numerical
experiments.

3. A sinusoidal loading of the form (5.57) with amplitude $\varepsilon_{\mathrm{amp}}$ and
frequency $f$ is applied over two cycles, resolved with a fixed number
of load steps per cycle.

4. The amplitude $\sigma_{\mathrm{amp}}$ and the phase angle $\delta$ of the stress response (5.58)
are determined via a least-squares fit to the computed macroscopic
stress values in the second cycle to avoid transient effects.
Subsequently, $E'$ and $E''$ are computed via equations (5.59) and (5.60).
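
Step 4 of the procedure above can be sketched as a linear least-squares problem. The stress response (5.58) is expanded as $A \sin(\omega t) + B \cos(\omega t)$ plus a constant offset that accounts for the static load, so that $\sigma_{\mathrm{amp}} = \sqrt{A^2 + B^2}$ and $\delta = \operatorname{atan2}(B, A)$. The offset term, the function names and the synthetic data are assumptions of this sketch and are not taken from the original implementation.

```python
import numpy as np


def fit_harmonic(t, stress, omega):
    """Least-squares fit of stress(t) ~ A sin(omega t) + B cos(omega t) + offset."""
    design = np.column_stack([np.sin(omega * t), np.cos(omega * t), np.ones_like(t)])
    (a, b, offset), *_ = np.linalg.lstsq(design, stress, rcond=None)
    sigma_amp = np.hypot(a, b)     # amplitude of the harmonic part
    delta = np.arctan2(b, a)       # phase shift relative to sin(omega t)
    return sigma_amp, delta, offset


def complex_moduli(sigma_amp, delta, eps_amp):
    """Storage and loss modulus via Eqs. (5.59) and (5.60)."""
    ratio = sigma_amp / eps_amp
    return ratio * np.cos(delta), ratio * np.sin(delta)


# Synthetic second cycle at f = 10 Hz with 30 load steps (illustrative data, MPa)
omega = 2.0 * np.pi * 10.0
eps_amp = 5e-4
t = np.linspace(0.1, 0.2, 30, endpoint=False)
stress = 2.0 + eps_amp * (2000.0 * np.sin(omega * t) + 180.0 * np.cos(omega * t))

sigma_amp, delta, offset = fit_harmonic(t, stress, omega)
storage, loss = complex_moduli(sigma_amp, delta, eps_amp)
```

Because the fit is linear in $A$, $B$ and the offset, no iterative optimization or initial guess is needed. For the synthetic signal, the fit recovers the prescribed moduli of 2000 MPa and 180 MPa.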

Note that we do not use the affine-linear extrapolation for these
computations, due to the nonmonotone loading. To validate our approach
and to determine the necessary number of load steps per cycle, we
apply steps 1-4 for a homogeneous polypropylene microstructure under
isothermal conditions. The parameters for the sinusoidal loading are
$\varepsilon_{\mathrm{static}} = 0.1\%$, $\varepsilon_{\mathrm{amp}} = 0.05\%$, see Kehrer et al. (2018), and $f = 10$ Hz. For
this frequency, the storage and loss modulus of the viscoelastic model for
polypropylene are given by $E' = 2012.22$ MPa and $E'' = 177.68$ MPa. In
addition to $E'$ and $E''$, we also track the effective dissipated energy (5.26)
in our computations and compare it to the analytical formula (5.62). The
relative errors for $E'$, $E''$ and $D_{\mathrm{cycle}}$ are shown in Fig. 5.6 as a function of
the load steps per cycle.
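
The analytical reference values can be reproduced from the parameters in Tab. 5.2. The sketch below assumes a uniaxial stress state, an elastic bulk response and isothermal conditions at $\theta_{\mathrm{ref}}$ (shift factor equal to one), so that the complex Young's modulus follows from the complex shear modulus $G^*$ of the generalized Maxwell model via $E^* = 9 K_0 G^* / (3 K_0 + G^*)$. The function name and the `shift` argument are illustrative.

```python
import numpy as np

# Polypropylene parameters from Tab. 5.2 (moduli in MPa, time constants in s)
K0 = 4930.0
G0 = 415.6
TAU = 10.0 ** np.arange(-4, 5)
G_MAXWELL = np.array([154.8, 127.0, 97.6, 72.3, 50.9, 38.4, 36.7, 31.5, 30.6])


def complex_youngs_modulus(frequency, shift=1.0):
    """Complex Young's modulus E* = E' + i E'' of the generalized Maxwell model."""
    x = 1j * 2.0 * np.pi * frequency * shift * TAU
    g_star = G0 + np.sum(G_MAXWELL * x / (1.0 + x))   # complex shear modulus
    return 9.0 * K0 * g_star / (3.0 * K0 + g_star)    # elastic bulk response


e_star = complex_youngs_modulus(10.0)
storage, loss = e_star.real, e_star.imag
```

Evaluated at $f = 10$ Hz, this yields $E' \approx 2012.22$ MPa and $E'' \approx 177.68$ MPa, matching the values stated above.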

For more than 30 load steps per cycle, the relative error for all tracked
quantities falls below 1%. Indeed, $E''$ as determined by our DMA
computation virtually coincides with its analytical value. Note that
the error in dissipation does not tend to 0 for finer resolutions. This
is a consequence of the stress relaxation under static strain loading,
which still causes a small additional amount of energy dissipation. In
preliminary computations, a higher number of cycles was considered as
well. However, the results did not differ substantially. Hence, we choose
30 load steps per cycle for all subsequent computations.

Figure 5.6: Polypropylene: Relative error between analytic values and the results of the
virtual DMA tests for $E'$, $E''$ and $D_{\mathrm{cycle}}$ as a function of load steps per cycle


With the established procedure, we simulate uniaxial DMA tests at
various static load values for the planar short glass-fiber reinforced
polypropylene microstructure, see Fig. 5.5. In particular, the effect of the
thermomechanical coupling under adiabatic conditions on $E'$ and $E''$
is of interest. The loading is applied in the xz-plane at angles between
0° and 90° with respect to the x-direction. Static loads $\varepsilon_{\mathrm{static}}$ between 0.1%
and 1.0% are considered. The amplitude and frequency are fixed at
$\varepsilon_{\mathrm{amp}} = 0.05\%$ and $f = 10$ Hz, respectively.

In Fig. 5.7, the results for $E'$ and $E''$ are plotted alongside the mean
temperature during the harmonic excitation as a function of the loading
angle. First, we take a look at the storage modulus. For the 0° load
case, i.e., in-plane loading, the storage modulus is at its peak value. This
is due to the stiffening effect of the fibers. For increasing load angle,
it drops by ca. 20% up to 45° and subsequently stagnates. Similar to
the observations in Sec. 5.4.2, the material cools down under tensile
loading due to the Gough-Joule effect, see Fig. 5.7c. This causes a
stiffening of the polypropylene matrix and an increase of $E'$. The effect is
most pronounced for the 90° load case, where we observe the largest
temperature difference between adiabatic and isothermal conditions. For
a static load of 1.0%, the relative difference between the adiabatic
computation and the isothermal computation is slightly below 6%.

Figure 5.7: Short glass-fiber reinforced polypropylene: Complex moduli and average
temperature as a function of the loading angle with respect to the x-axis in the xz-plane

The loss modulus $E''$ displays a slightly different profile with respect
to the loading angle. Its value is at its maximum for load angles between
0° and 15°, where the strong strain localization around the fibers leads to
strong dissipation. Subsequently, the loss modulus decreases linearly.
The effect of the static loading on the loss modulus under adiabatic
conditions is more pronounced than for the storage modulus. As a
decrease in temperature brings the temperature of polypropylene closer
to its glass transition temperature, the dissipated energy and $E''$ increase.
At a load angle of 90°, where the self-cooling is most pronounced, even
the lowest static loading of $\varepsilon_{\mathrm{static}} = 0.1\%$ leads to a 5% difference in the
loss moduli. The difference increases with the static loading, reaching
13% for $\varepsilon_{\mathrm{static}} = 1.0\%$.

We conclude that the thermomechanical coupling can have a significant
effect when characterizing thermoplastics-based composites using
DMA. Due to the Gough-Joule effect, the effective behavior of the
material, in particular $E''$, becomes load dependent, i.e., nonlinear. This
is particularly pronounced for high loading frequencies, when there
is no time for thermal conduction or radiation to take place and the
conditions are approximately adiabatic. To obtain precise results for
real-life experiments, a strict temperature control of the specimen and
low static loadings are therefore necessary.


Self-heating under harmonic loading. In the previous Sec. 5.4.3, we 
considered an oscillatory loading with a small number of cycles. In this 
case, the observed temperature changes were mostly due to the Gough- 
Joule effect caused by the static loading. However, for a high number 
of cycles, the dissipated energy accumulates over time and becomes the 
main driver of the temperature evolution. For example, such conditions 
frequently occur in fatigue testing, where the self-heating of the specimen 
poses a major challenge (Rittel, 2000; Mortazavian and Fatemi, 2015). 
Typically, in the first hundreds of cycles, the temperature increases in a 
roughly linear fashion (Jegou et al., 2013) and subsequently reaches an 
equilibrium value when dissipation and thermal conduction reach an 
equilibrium state. This limits, for instance, the range of viable loading 
frequencies for testing (Jia and Kagan, 1998; De Monte et al., 2010). 

Motivated by these findings, we take a look at the effect of the
thermomechanical coupling on the dissipative characteristics of the short
glass-fiber reinforced composite in the initial stage of a high cycle test.
More precisely, we prescribe 100 cycles of harmonic stress-controlled
uniaxial tensile loading in x-direction with a frequency of $f = 10$ Hz.
The static stress is fixed at $\sigma_{\mathrm{static}} = 30$ MPa with a stress amplitude of
$\sigma_{\mathrm{amp}} = 30$ MPa, corresponding to a load factor of $R = \sigma_{\min}/\sigma_{\max} = 0$. As
only a short time-frame of 10 seconds is considered, we assume adiabatic
conditions. First, we consider the evolution of the temperature and the
strain amplitude. In Fig. 5.8a, the minimum, maximum and average
temperature are plotted for each cycle. Initially, the mean temperature is
lower than the reference $\theta_{\mathrm{ref}} = 293.15$ K, due to the Gough-Joule effect.
Over time, the self-heating caused by the dissipated energy leads to a
linear increase and after 25 cycles the initial cool-down is compensated.

Figure 5.8: Short glass-fiber reinforced polypropylene: Temperature and strain amplitude
for each of 100 cycles under stress-controlled uniaxial harmonic loading in z-direction

Together with the temperature, the strain amplitude increases as the
material softens, see Fig. 5.8b. However, the reference value of $\varepsilon_{\mathrm{amp}}$ for
the isothermal case is reached after 42 cycles when the mean temperature
has already surpassed $\theta_{\mathrm{ref}}$. Taking a look at the minimum and maximum
temperature in Fig. 5.8a, we observe that the large stress amplitude
leads to a significant fluctuation of about 1 K for each cycle. Hence, the
behavior of polypropylene fluctuates within each cycle, resulting in a
slight reduction of the amplitude.

Last but not least, we take a look at the dissipation and the loss
modulus for each cycle. Consistent with our observations in Sec. 5.4.3, the
magnitude of the loss modulus, see Fig. 5.9a, is initially higher than
the isothermal prediction and subsequently decreases with increasing
temperature. As the temperature reaches its reference value, so does $E''$,
indicating that it is mostly unaffected by the large stress amplitude and
the resulting intracyclic temperature fluctuations. The dissipation per
cycle follows a similar trend. However, it barely exceeds the isothermal
reference value in the first few cycles, as the higher loss modulus is
partly compensated by the lower strain amplitude.

Figure 5.9: Short glass-fiber reinforced polypropylene: Dissipated energy and loss modulus
for each of 100 cycles under stress-controlled uniaxial harmonic loading in z-direction

Overall, we observe that the dissipative behavior of the material changes 
significantly in the first cycles of a long-term harmonic excitation. At 
the end of 100 cycles, the loss modulus and dissipation are 16% and 
12% lower, respectively, than the values predicted for the isothermal 
setting. Thus, when predicting the temperature changes for fatigue tests 
based on (5.62), see Handa et al. (1999), accounting for the temperature 
dependence of the material is mandatory. 
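
For orientation, the isothermal estimate based on (5.62) can be sketched as a simple energy balance: under adiabatic conditions, the energy dissipated per cycle heats the material according to its heat capacity. The numbers below are those of the homogeneous polypropylene DMA validation (not the composite under the 30 MPa stress amplitude), and the sketch deliberately neglects the Gough-Joule effect and the temperature dependence of $E''$, i.e., precisely the effects identified above as important.

```python
import math

EPS_AMP = 5e-4       # strain amplitude (0.05%), as in the DMA validation
E_LOSS = 177.68e6    # isothermal loss modulus of polypropylene at 10 Hz in Pa
C0 = 1.512e6         # heat capacity of polypropylene at theta_ref in J/(m^3 K)

# Energy dissipated per cycle, Eq. (5.62), in J/m^3
dissipation_per_cycle = math.pi * EPS_AMP**2 * E_LOSS

# Adiabatic temperature rise per cycle in K, neglecting thermoelastic coupling
temperature_rise_per_cycle = dissipation_per_cycle / C0
```

This gives a dissipation of about 140 J/m^3 and a temperature rise on the order of $10^{-4}$ K per cycle for the small DMA amplitude. As $D_{\mathrm{cycle}}$ scales quadratically with the strain amplitude, the rise is considerably larger under fatigue-type loading.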


5.5 Conclusions 


The present study was devoted to enabling the efficient computational
homogenization of thermomechanically coupled materials. Based on the
asymptotic homogenization framework for dissipative materials
(Chatzigeorgiou et al., 2016), we presented an efficient staggered algorithm
compatible with strain- or displacement-based micromechanical solvers.
Due to their computational power, we focused on FFT-based solution
schemes and found that the best performance was achieved in combination
with the Barzilai-Borwein method. Even for a composite with strong
thermomechanical coupling, its iteration counts and convergence behavior
hardly differed from the usual isothermal setting. The powerful
class of polarization-based schemes (Eyre and Milton, 1999; Michel
et al., 2001; Monchiet and Bonnet, 2012) was excluded from the present
work, as the complexity-reduction approach by Schneider et al. (2019)
may prevent the evaluation of dissipation and entropy. Further studies
are necessary to make these solvers available for thermomechanically
coupled problems.


In our numerical experiments, we observed that the computational 
overhead for the temperature-update step in the proposed algorithm 
was negligible. The difference in runtime between thermomechanically 
coupled and isothermal computations was dominated by evaluating the 
entropy and the dissipation, as part of the material law. In particular, 
computing the dissipation was costly for the chosen linear viscoelastic
model, as it involves applying an inverse stiffness tensor for each
Maxwell element. This led to an increase in overall computation times
by 20-30%. However, for material laws such as J2-plasticity, where
the dissipation is readily computed, the difference is much smaller.
Overall, we conclude that the proposed algorithm enables computing the 
effective mechanical behavior of thermomechanically coupled materials 
with nearly the same computational efficiency as traditional FFT-based 
methods in an isothermal setting. 

For the investigated glass-fiber reinforced polypropylene composites,
we observed that the thermomechanical coupling induced an effectively
nonlinear material behavior, even though the underlying material model
was linear viscoelastic. In particular, the dissipative characteristics of
the materials changed significantly between the isothermal and
adiabatic computations. Expanding the study of similar polymer-based
lightweight materials, such as sheet-molding compounds (Görthofer
et al., 2020), to include thermal effects seems promising. The presented
thermomechanical solver is compatible with the interpolation approach by
Köbler et al. (2018), enabling the development of effective (macroscopic)
surrogate models for arbitrary fiber orientations. For more general
structures and material models, thermomechanical FFT-based computations
may enter data-driven approaches, such as deep material networks
(Liu et al., 2019; Liu and Wu, 2019; Gajek et al., 2020), to facilitate the
simulation of components on the macroscale.

With regard to the material model of the polymer, it would be interesting 
to apply a free-volume-based approach for the shift factor (Fillers and 
Tschoegl, 1977), which takes into account the pressure dependence of 
the viscosity. Whereas a tensile loading mechanically increases the free 
volume, the accompanying adiabatic cooldown, observed in this study, 
may weaken this effect. Investigating the interaction between these 
phenomena seems worthwhile to enable a thorough characterization of 
the thermomechanical material behavior. In addition, expanding the 
material model to the viscoplastic domain (Krairi et al., 2019) appears 
attractive to investigate the influence of the plastic dissipation on the self- 
heating behavior of the material. As self-heating effects are particularly 
relevant in the context of fatigue and life-time predictions, coupling the 
presented thermomechanical solver with FFT-based schemes for damage 
(Boeff et al., 2015; Sharma et al., 2020) or fracture (Chen et al., 2019b; 
Ernesti et al., 2020) would be of interest. 


6 Anderson-accelerated polarization schemes for fast Fourier 
transform-based computational homogenization¹ 


6.1 Introduction 


Polarization-based methods, pioneered by Eyre and Milton (1999), constitute 
a powerful and memory-efficient class of solvers, oftentimes 
outperforming the fastest strain-based methods, see Sec. 7 in Schneider 
et al. (2019). Unfortunately, these algorithms are highly sensitive to the 
choice of algorithmic parameters, limiting their capabilities as general- 
purpose solvers. In particular for problems with infinite contrast, e.g., 
porous materials, where the strong convexity constant is generally 
unknown (Schneider, 2020b), this has proven to be highly detrimental 
to the performance of polarization methods (Schneider, 2019a). In 
this chapter, we study the combination of polarization methods and 
Anderson acceleration, producing a fast, flexible and versatile general- 


1 This chapter is based on Wicht et al. (2021a). For the sake of a coherent structure, 
formatting and typography of this thesis, minor changes have been made. To avoid 
redundancies in the text, the introduction has been shortened. The discussion of the 
material behavior in Sec. 6.3.6 was expanded. 
purpose FFT-based solver. Anderson acceleration (Anderson, 1965) 
is a method for improving the convergence behavior of fixed-point 
iterations when derivatives of the fixed-point mapping are not available. 
Anderson acceleration generates the next iterate as a mixture of a limited 
number (the so-called depth) of previous iterates, where the mixing 
coefficients solve an associated low-dimensional optimization problem. 
Anderson acceleration often leads to 
a substantial speed-up in applications, such as convective flow (Pollock 
et al., 2021), well-fracture (Aksenov et al., 2021), radiation-diffusion (An 
et al., 2017), computer graphics (Zhang et al., 2019) or microstructure 
generation (Kuhn et al., 2020). Anderson acceleration may be interpreted 
as a multi-secant Quasi-Newton method (Fang and Saad, 2009) and 
is "essentially equivalent" to GMRES for linear problems, see Walker 
and Ni (2011). Theoretical convergence assertions were only recently 
provided (Toth and Kelley, 2015; Evans et al., 2020). 

In FFT-based computational micromechanics, the Anderson-accelerated 
basic scheme was included as a solution algorithm in the AMITEX 
software package (Chen et al., 2019b), see Ch. 3 for a comparison to 
other (single-secant) Quasi-Newton methods. Unfortunately, when 
applied to the basic scheme, Anderson acceleration is unable to un- 
leash its full potential. Indeed, when applied to gradient descent (such 
as the basic scheme (Kabel et al., 2014; Schneider, 2017a; Bellis and 
Suquet, 2019)), Li and Li (2020) proved that the convergence rate of 
an Anderson-accelerated gradient method does not improve upon the 
optimum convergence rate of plain gradient descent. This theoretical 
result is backed up by computational experiments in Ch. 3. 

Applying Anderson acceleration to polarization schemes appears much 
more promising. Indeed, most of the time, an optimally tuned polariza- 
tion method is competitive or even outperforms the fastest strain-based 
solvers in terms of iteration count (Schneider et al., 2019; Moulinec and 
Silva, 2014; Monchiet and Bonnet, 2013). Thus, by relieving the user of 
the daunting task of identifying the optimum numerical parameters, the 
Anderson-accelerated polarization scheme turns into a general-purpose 
solver for FFT-based computational micromechanics. We wish to draw 
the reader’s attention to recent applications (Fu et al., 2020; Zhang et al., 
2019; Ouyang et al., 2020) of Anderson acceleration to operator-splitting 
methods, which motivated the present work. 

Please note that Shantraj et al. (2015) investigated the combination of 
a nonlinear GMRES method (Oosterlee and Washio, 2000) (which is 
equivalent to Anderson acceleration) and polarization methods in the 
setting of finite-strain crystal viscoplasticity, and report the Anderson- 
accelerated basic scheme to outperform the Anderson-accelerated po- 
larization methods. However, polarization methods are known to be 
less powerful at finite strains due to the non-convexity of the problem. 
We refer to Kabel et al. (2014, Sec. 3.2.5) for computational experiments. 
Thus, the conclusions of Shantraj et al. (2015) cannot be transferred to 
the small-strain setting. Furthermore, Shantraj et al. (2015) consider the 
deformation gradient and a rescaled polarization field as iterates of their 
algorithm. However, a recent study by Ouyang et al. (2020) demonstrates 
that it is preferable in terms of iteration counts and run-time to restrict 
Anderson acceleration to the lower-dimensional fixed-point iteration 
of the polarization. In the context of FFT-based micromechanics, this 
corresponds to accelerating the (damped) Eyre-Milton iteration, which 
is the approach we follow in this study. 

This chapter is organized as follows. After recapitulating the basics 
of polarization methods, see Sec. 6.2.1, and Anderson acceleration, see 
Sec. 6.2.2, we present the resulting algorithm in Sec. 6.2.3. In Sec. 6.3, 
we perform numerical experiments to evaluate the performance of 
Anderson-accelerated polarization methods and compare them to the 
fastest strain-based solution algorithms. 


6.2 Anderson-accelerated polarization schemes 


6.2.1 The Eyre-Milton equation and polarization schemes 


This section provides a streamlined presentation of polarization methods 
for FFT-based computational micromechanics at small strains, see 
Schneider et al. (2019) as a general reference. 

In the context of small-strain continuum mechanics, let a cuboid cell $Y$ 
in $\mathbb{R}^d$ be given, together with a heterogeneous strain energy density 


$$w : Y \times \mathrm{Sym}(d) \to \mathbb{R}, \quad (x, \varepsilon) \mapsto w(x, \varepsilon), \tag{6.1}$$


where $d = 2, 3$ denotes the spatial dimension and $\mathrm{Sym}(d)$ is the space 
of symmetric $d \times d$ matrices. In the following, we assume that $w$ is 
measurable in its first variable and (twice) differentiable in the strain. 
For a general physically nonlinear hyperelastic material, $w$ corresponds 
to the strain-energy density so that the stress operator computing the 
Cauchy stress tensor $\sigma(x, \varepsilon)$ at $x$ in response to the applied (infinitesimal) 
strain $\varepsilon \in \mathrm{Sym}(d)$ is defined by the hyperelastic relation 


$$\sigma : Y \times \mathrm{Sym}(d) \to \mathrm{Sym}(d), \quad (x, \varepsilon) \mapsto \frac{\partial w}{\partial \varepsilon}(x, \varepsilon). \tag{6.2}$$


Alternatively, w may arise as the incremental potential of a generalized 
standard material after time discretization and static condensation of in- 
ternal variables, see Miehe (2002). Assuming vanishing non-equilibrium 
stresses, the condensed incremental potential permits the hyperelastic 
definition (6.2) of the stress operator. Note that, in this case, w has no 
intrinsic physical meaning, as it depends on the chosen time-integration 
scheme and mixes the Helmholtz free energy and the dissipation poten- 
tial of the material. For the convenience of the reader, we suppress the 
$x$-dependency of $w$ and $\sigma$ in the following. 

Introducing the space of periodic and mean-free displacement fluctuations 


$$H^1_\#(Y; \mathbb{R}^d) = \left\{ u : \mathbb{R}^d \to \mathbb{R}^d \;\middle|\; u \text{ periodic}, \ \partial_n u \text{ anti-periodic on } \partial Y, \ \int_Y u \, dx = 0 \right\}, \tag{6.3}$$


we seek a solution $u \in H^1_\#(Y; \mathbb{R}^d)$, which satisfies the static balance of 
linear momentum without volume forces 


$$\operatorname{div} \sigma(\bar\varepsilon + \nabla^s u) = 0 \tag{6.4}$$


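for a prescribed macroscopic strain $\bar\varepsilon$. The corresponding space of 
square-integrable stress and strain fields $L^2(Y; \mathrm{Sym}(d))$ is endowed with the 
inner product 


$$\langle \varepsilon_1, \varepsilon_2 \rangle_{L^2} = \frac{1}{|Y|} \int_Y \varepsilon_1(x) : \varepsilon_2(x) \, dx \quad \text{for } \varepsilon_1, \varepsilon_2 \in L^2(Y; \mathrm{Sym}(d)), \tag{6.5}$$

where $|Y|$ denotes the volume of the cell $Y$. Assuming that the stress 
for vanishing strain is square-integrable, $w$ is $\mu$-strongly convex in its 
second variable 


$$\langle \sigma(\varepsilon_1) - \sigma(\varepsilon_2), \varepsilon_1 - \varepsilon_2 \rangle_{L^2} \geq \mu \, \|\varepsilon_1 - \varepsilon_2\|_{L^2}^2 \quad \forall \varepsilon_1, \varepsilon_2 \in L^2(Y; \mathrm{Sym}(d)), \tag{6.6}$$


and has an $L$-Lipschitz gradient 


$$\|\sigma(\varepsilon_1) - \sigma(\varepsilon_2)\|_{L^2} \leq L \, \|\varepsilon_1 - \varepsilon_2\|_{L^2} \quad \forall \varepsilon_1, \varepsilon_2 \in L^2(Y; \mathrm{Sym}(d)), \tag{6.7}$$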
the balance of linear momentum has a unique solution (Bellis and Suquet, 
2019). This permits defining the effective stress $\bar\sigma$ associated to the strain 
loading $\bar\varepsilon$ 


$$\bar\sigma(\bar\varepsilon) = \frac{1}{|Y|} \int_Y \sigma(\bar\varepsilon + \nabla^s u) \, dx, \tag{6.8}$$


where $u \in H^1_\#(Y; \mathbb{R}^d)$ solves equation (6.4). For more general existence 
results for monotone operators², which are not necessarily derived from 
a potential, we refer to Ch. 22 in Bauschke and Combettes (2017). 

It can be shown (Schneider, 2015) that for any displacement fluctuation 
field $u$ solving equation (6.4) and any reference stiffness $\mathbb{C}^0$, the total 
strain $\varepsilon = \bar\varepsilon + \nabla^s u \in L^2(Y; \mathrm{Sym}(d))$ solves the Lippmann-Schwinger 
equation 


$$\varepsilon + \Gamma^0 : \left(\sigma(\varepsilon) - \mathbb{C}^0 : \varepsilon\right) = \bar\varepsilon, \tag{6.9}$$


where $\Gamma^0$ denotes Green's operator associated to $\mathbb{C}^0$ (Mura, 1987), 
a bounded linear operator on $L^2(Y; \mathrm{Sym}(d))$. Conversely, suppose that 
$\varepsilon \in L^2(Y; \mathrm{Sym}(d))$ solves the Lippmann-Schwinger equation (6.9) for 
some reference stiffness $\mathbb{C}^0$; then we may find $u \in H^1_\#(Y; \mathbb{R}^d)$ such that 
$\varepsilon = \bar\varepsilon + \nabla^s u$ and $u$ solves the balance of linear momentum (6.4), see, for 
instance, Schneider (2015). 

The Lippmann-Schwinger equation serves as the basis of successful 
numerical algorithms for solving the balance of linear momentum 
(6.4), see Ch. 3 for an overview. Alternatively, we may investigate a 
formulation based on the polarization field $P = \sigma(\varepsilon) + \mathbb{C}^0 : \varepsilon$, i.e., the 
Eyre-Milton equation (Eyre and Milton, 1999) 


$$P - Y^0 : Z^0(P) = 2\,\mathbb{C}^0 : \bar\varepsilon \tag{6.10}$$


in terms of the operator 

$$Y^0 = \mathrm{I} - 2\,\mathbb{C}^0 : \Gamma^0, \tag{6.11}$$

a non-local reflection operator on $L^2(Y; \mathrm{Sym}(d))$, and the operator 

$$Z^0 = \mathrm{I} - 2\,\mathbb{C}^0 : (\sigma + \mathbb{C}^0)^{-1}, \quad P \mapsto P - 2\,\mathbb{C}^0 : (\sigma + \mathbb{C}^0)^{-1}(P), \tag{6.12}$$


a nonexpansive and local operator on $L^2(Y; \mathrm{Sym}(d))$. More precisely, 
the operator $Y^0$ satisfies the reflection identity $Y^0 \circ Y^0 = \mathrm{I}$, formulated 
in terms of the identity operator $\mathrm{I}$ on $L^2(Y; \mathrm{Sym}(d))$. Furthermore, $Z^0$ is 
well-defined, as the operator $\varepsilon \mapsto \sigma(\varepsilon) + \mathbb{C}^0 : \varepsilon$ is invertible due to the 
strong convexity of $w$ and the non-degeneracy of the reference stiffness 
$\mathbb{C}^0$. 

2 In the present setting, $\mu$-monotonicity of $\sigma$ is implied by (6.6). 
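The reflection identity can be verified in a few lines. The following is a sketch, assuming the standard projector-type property $\Gamma^0 : \mathbb{C}^0 : \Gamma^0 = \Gamma^0$ of Green's operator, which is not stated explicitly in the text above:

```latex
\begin{align*}
Y^0 \circ Y^0
  &= \left(\mathrm{I} - 2\,\mathbb{C}^0 : \Gamma^0\right)\left(\mathrm{I} - 2\,\mathbb{C}^0 : \Gamma^0\right) \\
  &= \mathrm{I} - 4\,\mathbb{C}^0 : \Gamma^0
     + 4\,\mathbb{C}^0 : \left(\Gamma^0 : \mathbb{C}^0 : \Gamma^0\right) \\
  &= \mathrm{I} - 4\,\mathbb{C}^0 : \Gamma^0 + 4\,\mathbb{C}^0 : \Gamma^0
   = \mathrm{I}.
\end{align*}
```

In other words, $\mathbb{C}^0 : \Gamma^0$ acts as a projector under this assumption, and $Y^0$ is the reflection associated with it.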

For any solution $\varepsilon$ of the Lippmann-Schwinger equation (6.9), the polar- 
ization field $P = \sigma(\varepsilon) + \mathbb{C}^0 : \varepsilon$ solves the Eyre-Milton equation and vice 
versa, as a direct implication of the Eyre-Milton identity 


$$2\,\mathbb{C}^0 : \left(\mathrm{I} + \Gamma^0 : (\sigma - \mathbb{C}^0)\right) = \left(\mathrm{I} - Y^0 : Z^0\right)(\sigma + \mathbb{C}^0), \tag{6.13}$$


a simple algebraic rewriting of the Lippmann-Schwinger equation (Schneider 
et al., 2019, Sec. 2). For any damping parameter $\alpha \in [0, 1)$, we may 
consider the damped Picard iteration associated to the Eyre-Milton 
equation 

$$P^{k+1} = \alpha P^k + (1 - \alpha) \left[ 2\,\mathbb{C}^0 : \bar\varepsilon + Y^0 : Z^0(P^k) \right], \tag{6.14}$$


which is called the polarization scheme (Monchiet and Bonnet, 2012; 
Moulinec and Silva, 2014). Under the hypotheses of this section, for any 
initial value $P^0 \in L^2(Y; \mathrm{Sym}(d))$, reference stiffness $\mathbb{C}^0$ and damping 
parameter $\alpha \in [0, 1)$, the iterative scheme (6.14) converges to a solution 
of the Eyre-Milton equation (6.10). This is a direct consequence of 
the identification (Schneider et al., 2019, Sec. 3) of the polarization 
scheme (6.14) as the Douglas-Rachford method (Lions and Mercier, 
1979), and the tight linear convergence bounds for the Douglas-Rachford 
splitting established by Giselsson and Boyd (2017), see (Schneider, 2019b, 
Sec. 3.1). 

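Restricted to the class of reference materials proportional to the identity, 
explicit formulae for obtaining the optimum convergence rate are 
available. More precisely, if $\mathbb{C}^0 = 1/\gamma \, \mathrm{I}$ holds in terms of a positive 
number $\gamma$, the distance of the iterates of (6.14) to the fixed point $P^*$ 
decreases according to 


$$\|P^{k+1} - P^*\|_{L^2} \leq \left(\alpha + (1 - \alpha)\,\delta\right) \|P^k - P^*\|_{L^2} \tag{6.15}$$

with 

$$\delta = \max\left( \frac{\gamma L - 1}{\gamma L + 1}, \ \frac{1 - \gamma \mu}{1 + \gamma \mu} \right), \tag{6.16}$$


see Theorem 2 in Giselsson and Boyd (2017). The best convergence rate 
is achieved by setting $\gamma = 1/\sqrt{\mu L}$ and $\alpha = 0$, leading to 


$$\|P^{k+1} - P^*\|_{L^2} \leq \frac{\sqrt{L} - \sqrt{\mu}}{\sqrt{L} + \sqrt{\mu}} \, \|P^k - P^*\|_{L^2}. \tag{6.17}$$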
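To make the sensitivity to the step size tangible, the bound of Giselsson and Boyd (2017) can be evaluated numerically. The following is a minimal sketch: the function name `contraction_factor` and the values of the strong convexity and Lipschitz constants are illustrative assumptions, not taken from the numerical experiments of this chapter.

```python
import math


def contraction_factor(gamma, mu, L, alpha=0.0):
    """Bound alpha + (1 - alpha) * delta on the contraction factor of the
    damped polarization scheme, with delta as in Theorem 2 of
    Giselsson and Boyd (2017)."""
    delta = max((gamma * L - 1.0) / (gamma * L + 1.0),
                (1.0 - gamma * mu) / (1.0 + gamma * mu))
    return alpha + (1.0 - alpha) * delta


mu, L = 1.0, 100.0                       # illustrative constants (assumption)
gamma_opt = 1.0 / math.sqrt(mu * L)      # optimum step size
best = contraction_factor(gamma_opt, mu, L)
detuned = contraction_factor(10.0 * gamma_opt, mu, L)
closed_form = (math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))
```

For these values, the optimally tuned bound coincides with the closed-form rate, whereas a step size that is off by one order of magnitude already pushes the bound close to one.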


At this point, some remarks are in order: 

1. For simplicity of exposition, we restricted to the case where the stress 
operator derives from a convex potential. However, Giselsson (2017) 
established linear convergence estimates for the Douglas-Rachford 
method in the context of monotone operator theory, which places fewer 
restrictions on $\sigma$. In fact, $\mu$-monotonicity and $L$-Lipschitz continuity 
of $\sigma$ are sufficient to prove linear convergence. 

2. In practice, it can be difficult or, at least, computationally demanding 
to determine the optimum parameters for convergence. If $w$ 
is twice differentiable, $\mu$ and $L$ may be obtained from the minimum 
and maximum eigenvalue of the algorithmic tangent field 
$\mathbb{C}^{\mathrm{alg}} = \partial^2 w/\partial \varepsilon^2$, respectively. For small-strain materials, the maximum 
slope of the stress operator is typically not larger than the maximum 
slope of the algorithmic tangent at zero strain. Thus, for elastoplasticity, 
for example, $L$ may be estimated from the maximum eigenvalue 
of the initial elastic stiffness, maximized over all $x \in Y$. Computing 
$\mu$, on the other hand, may require an eigenvalue decomposition of 
$\mathbb{C}^{\mathrm{alg}}$, which is computationally expensive. Hence, an approach should 
be identified which minimizes how often $\mu$ and $L$ are computed 
while preserving the convergence rate of the scheme, see Sec. 6.3.2 for 
further discussion. 

3. The situation becomes even more difficult for stress operators which 
do not derive from a potential. In this case, the optimum choice of 
algorithmic parameters depends on the regularity conditions of $\sigma$, 
such as $L$-Lipschitz continuity or $1/L$-cocoercivity, see Theorems 6.5 
and 7.4 in Giselsson (2017). In practice, it is difficult to check 
these conditions for a given material law, further complicating the 
choice of $\gamma$. 

4. Although setting $\alpha$ to zero is typically the theoretically optimum 
choice (Giselsson and Boyd, 2017; Giselsson, 2017), this may decrease 
the robustness w.r.t. numerical errors of the polarization scheme 
(Schneider et al., 2019, Sec. 7.3). 

5. The quantity $\gamma$ used for parameterizing the reference material via 
$\mathbb{C}^0 = 1/\gamma \, \mathrm{I}$ is related to the shear modulus $\mu^0$ of the reference material 
by $2\mu^0 = 1/\gamma$. From a numerical point of view, the parameter $\gamma$ 
specifies the (relative) step size of the Douglas-Rachford scheme 
(Schneider et al., 2019, Sec. 3). Indeed, as Moulinec-Suquet's basic 
scheme corresponds to an explicit gradient-descent method (Kabel 
et al., 2014; Schneider, 2017a; Bellis and Suquet, 2019), the polarization 
scheme (6.14) with $\alpha = 0$ corresponds to an implicit gradient-descent 
method. For equal step sizes, both methods lead to similar convergence 
behavior (Schneider et al., 2019, Sec. 7). Owing to the explicit 
updates, the basic scheme suffers from a step-size restriction in order 
to retain stability. In contrast, due to the implicit nature of the updates, 
polarization schemes (6.14) are stable for any step size. In particular, 
much larger step sizes than for the basic scheme can be used. The 
latter phenomenon is responsible for the improved convergence speed 
of the polarization methods compared to the basic scheme. Moreover, 
the relaxation (6.14) of the fixed-point scheme by a factor $\alpha$ may also 
be applied to the basic scheme. However, the resulting scheme will be 
equivalent to the basic scheme with a different step size. In contrast, 
for polarization methods, relaxation leads to more general methods. 

Overall, we may conclude that the choice of the step size $\gamma$ and the 
relaxation parameter $\alpha$ is not straightforward. This is particularly 
bothersome, since the convergence rate of polarization-based schemes 
exhibits a strong sensitivity w.r.t. these parameters (Schneider et al., 2019, 
Sec. 7). Thus, for problems where estimates for $\mu$ or $L$ are not available, 
polarization-based schemes may perform poorly, see Section 3.2 in 
Schneider (2019a), limiting their usefulness as general-purpose solvers. 
In the following Sections 6.2.2 and 6.2.3, we shall discuss how Anderson 
acceleration may counterbalance the slow convergence behavior 
of polarization-based schemes (6.14) for suboptimal parameter choices. 


6.2.2 Anderson acceleration for fixed-point iterations 


Suppose a (nonlinear) operator $F : X \to X$ is given, mapping a Banach 
space $X$ into itself, which is Lipschitz-continuous with Lipschitz constant 
$\rho < 1$. For any initial value $x^0 \in X$, Banach's fixed point theorem 
(Banach, 1922) asserts that the iterative scheme 


$$x^{k+1} = F(x^k) \tag{6.18}$$

converges to the unique fixed point $x^*$ of $F$ with rate $\rho$, i.e., 

$$\|x^{k+1} - x^*\|_X \leq \rho \, \|x^k - x^*\|_X \tag{6.19}$$


holds. Anderson acceleration (Anderson, 1965), sometimes also called 
Anderson mixing, is a method applicable to general fixed-point iterations 
(6.18). It aims at improving the convergence properties of the Picard 
iteration (6.18) for cases where derivatives of F are either not available 
or expensive to compute. 

Anderson acceleration depends on a non-negative integer $m$ called depth. 
For $m = 0$, it reduces to the original Picard iteration (6.18). For general 
$m > 0$, to determine the next iterate $x^{k+1}$, Anderson acceleration “mixes” 
the last $m_k + 1$ iterates 


$$x^{k+1} = \sum_{i=1}^{m_k+1} \alpha_i^k \, F(x^{k+1-i}), \tag{6.20}$$


where $m_k = \min(k, m)$ and the coefficients $\alpha^k \in \mathbb{R}^{m_k+1}$ are chosen to 
minimize the function 


$$\left\| \alpha_1^k r^k + \alpha_2^k r^{k-1} + \ldots + \alpha_{m_k+1}^k r^{k-m_k} \right\|_X, \tag{6.21}$$

where $r^k = x^k - F(x^k)$ denote the residuals, subject to the mixing 
constraint 


$$\sum_{i=1}^{m_k+1} \alpha_i^k = 1. \tag{6.22}$$


The formulation (6.20) involves applying the nonlinear operator $F$ $(m_k + 1)$ 
times for each iteration step of Anderson mixing. As evaluating the 
operator $F$ is typically the most expensive step, practical implementations 
are based on the (already) computed residuals $r^i$ instead, using the 
equivalent update formula 


$$x^{k+1} = \sum_{i=1}^{m_k+1} \alpha_i^k \left[ x^{k+1-i} - r^{k+1-i} \right]. \tag{6.23}$$


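In this way, the nonlinear operator $F$ needs to be evaluated only once 
per Anderson iteration. Also, if $X$ is a Hilbert space, the minimization 
problem (6.21) simplifies to a quadratic programming problem for $\alpha^k$, 
which may be solved by 


$$\alpha^k = \frac{A_k^\dagger \mathbf{1}}{\mathbf{1}^T A_k^\dagger \mathbf{1}}, \tag{6.24}$$


where $\mathbf{1}$ is a vector of all ones in $\mathbb{R}^{m_k+1}$, $A_k$ is the symmetric positive 
semidefinite matrix 


$$A_k = \begin{bmatrix}
\langle r^k, r^k \rangle_X & \langle r^k, r^{k-1} \rangle_X & \cdots & \langle r^k, r^{k-m_k} \rangle_X \\
\langle r^{k-1}, r^k \rangle_X & \langle r^{k-1}, r^{k-1} \rangle_X & \cdots & \langle r^{k-1}, r^{k-m_k} \rangle_X \\
\vdots & \vdots & \ddots & \vdots \\
\langle r^{k-m_k}, r^k \rangle_X & \langle r^{k-m_k}, r^{k-1} \rangle_X & \cdots & \langle r^{k-m_k}, r^{k-m_k} \rangle_X
\end{bmatrix} \tag{6.25}$$


and $A_k^\dagger$ is the Moore-Penrose pseudoinverse (Moore, 1920; Penrose, 
1955) of the matrix $A_k$. In case the matrix $A_k$ is ill-conditioned and $X$ is 
finite-dimensional, Fu et al. (2020) recommend solving the optimization 
problem (6.21) based on a singular value decomposition (SVD) of the 
matrix 


$$\left[ r^k - r^{k-1}, \ \ldots, \ r^{k-m_k+1} - r^{k-m_k} \right] \in \mathbb{R}^{\dim X \times m_k}. \tag{6.26}$$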
However, this approach requires a higher memory footprint than the 
procedure based on the pseudoinverse. Furthermore, we did not encounter 
ill-conditioning of the matrix $A_k$ during our numerical experiments, see 
section 6.3. This suggests that the SVD-approach may not be necessary 
for the problem at hand. 
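The mixing procedure described above can be sketched for a generic fixed-point map on a finite-dimensional space. The following is a minimal illustration, not the implementation used in this work: instead of the closed-form pseudoinverse solution, it eliminates the mixing constraint and solves the resulting unconstrained least-squares problem with `numpy.linalg.lstsq`, which yields a minimizer of the same constrained problem. The function name `anderson_fixed_point` and the linear test map are assumptions made for illustration.

```python
import numpy as np


def anderson_fixed_point(F, x0, depth=4, tol=1e-10, maxit=500):
    """Anderson acceleration of x <- F(x) with a history of `depth` + 1 iterates.

    Returns the final iterate and the number of iterations performed."""
    x = np.asarray(x0, dtype=float)
    G, R = [], []                        # histories of F(x^i) and r^i = x^i - F(x^i)
    for k in range(maxit):
        g = F(x)
        r = x - g
        if np.linalg.norm(r) < tol:
            return x, k
        G.append(g)
        R.append(r)
        G, R = G[-(depth + 1):], R[-(depth + 1):]
        if len(R) == 1:                  # no history available: plain Picard step
            x = g
            continue
        # Eliminate the constraint sum(alpha) = 1: the coefficient of the newest
        # residual is 1 - sum(beta), and beta minimizes ||r + D @ beta||.
        D = np.stack([r_old - r for r_old in R[:-1]], axis=1)
        beta = np.linalg.lstsq(D, -r, rcond=None)[0]
        # Mix the images F(x^i), i.e., x^i - r^i, with the same coefficients.
        x = g + sum(b * (g_old - g) for b, g_old in zip(beta, G[:-1]))
    return x, maxit


# Linear contraction with Lipschitz constant 0.9 (illustrative test problem).
M = np.diag([0.9, 0.5, -0.3, 0.1])
b = np.ones(4)
F = lambda x: M @ x + b

x_aa, its_aa = anderson_fixed_point(F, np.zeros(4), depth=4)
x_picard, its_picard = anderson_fixed_point(F, np.zeros(4), depth=0)
```

Setting `depth=0` recovers the Picard iteration, which permits a direct comparison of the iteration counts on the same problem.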

At the end of this section, we wish to put Anderson acceleration into 
context, and report on recent convergence assertions. Anderson accel- 
eration may be interpreted as a Quasi-Newton method of multi-secant 
type (Fang and Saad, 2009). Applied to linear problems, Walker and Ni 
(2011) showed that Anderson acceleration is “essentially equivalent” to 
GMRES with depth m (Saad and Schultz, 1986). Toth and Kelley (2015) 
showed that Anderson acceleration does not decrease the convergence 
rate of linearly converging fixed point iterations. Furthermore, Evans 
et al. (2020) showed that Anderson acceleration improves upon the 
convergence rate of linearly convergent fixed-point iterations, but not 
for those converging quadratically. However, some caution is advised 
for these results, because they assume that the coefficients $\alpha^k$ remain 
uniformly bounded (and uniformly bounded away from zero) in $k$. This 
assumption is difficult to verify in practice, as it is an assumption 
on the Anderson acceleration procedure and not an assumption on 
the fixed-point mapping $F$. Furthermore, Anderson acceleration may 
also converge if the original mapping $F$ is not contractive (Both et al., 
2019). For stationary Anderson acceleration with fixed coefficients $\alpha$, 
De Sterck and He (2020) provided convergence estimates for accelerating 
gradient-descent, drawing on the similarity to Nesterov’s scheme (Nes- 
terov, 1983) for m = 1. In numerical tests, the authors found that the 
convergence rate of the stationary version provides a rough performance 
estimate for the classical Anderson acceleration. Using a similar strategy, 
Wang et al. (2021) investigated the speed-up for accelerating ADMM, 
which may be interpreted as a dual version of the Douglas-Rachford 
splitting (Giselsson and Boyd, 2017). 

Of particular interest is the work of Li and Li (2020). They demonstrate 
that when Anderson acceleration is applied to gradient descent and 
strongly convex functions with Lipschitz gradient, the convergence 
rate is not improved compared to an optimally tuned gradient-descent 
scheme. At first, this result appears discouraging because other methods, 
for instance fast gradient solvers (Nesterov, 2004), lead to an improve- 
ment of the convergence rate. However, the problem with convergence 
assertions for Anderson acceleration is its finite depth m. Suppose, 
for instance, we consider solving a symmetric linear system. Suppose 
that we obtain a sufficiently accurate solution with MINRES (Paige and 
Saunders, 1975) in K steps. Then, choosing m > K, GMRES(m) gives 
identical iterates as MINRES. As Anderson(m) is essentially equivalent 
to GMRES, it also converges as quickly as MINRES. However, this 
speed is not reflected in convergence rates, because they always consider 
infinite sequences. 

Also, the Li and Li (2020) result may be interpreted in a positive way 
by noticing that it may be extremely hard to tune the parameters of 
gradient-descent schemes in an optimum fashion. Thus, Anderson 
acceleration may indeed lead to a benefit in practice, also for gradient 
descent. As Moulinec-Suquet’s basic scheme (Moulinec and Suquet, 
1994; 1998) is essentially a gradient-descent method for stress operators 
with potential (Kabel et al., 2014; Schneider, 2017a; Bellis and Suquet, 
2019), we may interpret the positive results of Gélébart’s AMITEX 
solver (Chen et al., 2019a;b), who applied Anderson acceleration to 
the basic scheme, as a testament for this statement. 

In this work, we shall follow a slightly different path by applying 
Anderson acceleration to the polarization scheme (6.14), and use it for 
avoiding tedious parameter calibration. 


6.2.3 Application to polarization schemes 


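Polarization schemes (6.14) are fixed-point methods (6.18) for the 
nonlinear mapping 


$$F_{\gamma,\alpha} : L^2(Y; \mathrm{Sym}(d)) \to L^2(Y; \mathrm{Sym}(d)), \quad P \mapsto \alpha P + (1 - \alpha) \left[ \frac{2}{\gamma} \, \bar\varepsilon + Y^0 : Z^0(P) \right], \tag{6.27}$$


where we restricted to $\mathbb{C}^0 = 1/\gamma \, \mathrm{I}$ for simplicity. Then, the operators $Y^0$ 
and $Z^0$ attain the form 


$$Y^0 = \mathrm{I} - 2\,\Gamma \quad \text{for} \quad \Gamma = \nabla^s \left(\operatorname{div} \nabla^s\right)^{-1} \operatorname{div}, \tag{6.28}$$


$$Z^0 = \left(\sigma - \frac{1}{\gamma} \, \mathrm{I}\right) \left(\sigma + \frac{1}{\gamma} \, \mathrm{I}\right)^{-1}. \tag{6.29}$$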


In the general setting of section 6.2.1, for any $\gamma > 0$ and $\alpha \in (0, 1]$, the 
operator $F_{\gamma,\alpha}$ is non-expansive, i.e., Lipschitz continuous with Lipschitz 
constant 1. If, furthermore, $w$ is strongly convex, for any $\gamma > 0$ and 
$\alpha \in [0, 1)$, $F_{\gamma,\alpha}$ is even contractive in view of the estimate (6.17). Unfortunately, 
it is not always apparent how to choose the parameter pair $(\gamma, \alpha)$ 
to ensure fast convergence. Even though explicit values for $(\gamma, \alpha)$ were 
listed, their practical determination may be expensive. Indeed, suitable 
constants $\mu$ and $L$ may be read off from eigenvalue analyses based 
on the material tangents $\frac{\partial \sigma}{\partial \varepsilon}(x, \varepsilon(x))$ if $\sigma$ is continuously differentiable. 
However, if the voxel count is large, the sheer number of eigenvalue 
decompositions may be expensive per se. 
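The eigenvalue-based estimate can be sketched as follows, assuming the material tangents are available as symmetric positive definite matrices per voxel (for instance in a 6x6 Mandel-type notation). The function name `estimate_step_size` and the toy tangents are hypothetical and serve only as an illustration.

```python
import numpy as np


def estimate_step_size(tangents):
    """Estimate mu, L and the step size gamma = 1 / sqrt(mu * L).

    tangents: array of shape (n_voxels, n, n) holding symmetric positive
    definite tangent matrices, one per voxel."""
    eigvals = np.linalg.eigvalsh(tangents)   # batched, ascending per voxel
    mu = eigvals[:, 0].min()                 # smallest eigenvalue over all voxels
    L = eigvals[:, -1].max()                 # largest eigenvalue over all voxels
    return mu, L, 1.0 / np.sqrt(mu * L)


# Toy two-phase example: three soft and two stiff voxels (illustrative values).
tangents = np.stack([2.0 * np.eye(6)] * 3 + [50.0 * np.eye(6)] * 2)
mu, L, gamma = estimate_step_size(tangents)
```

One eigenvalue decomposition per voxel is required, which illustrates why this estimate becomes costly for large voxel counts.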

Thus, we apply Anderson acceleration to the contractive operator $F_{\gamma,\alpha}$ 
(6.27), as discussed in section 6.2.2. A single step of the 
polarization scheme is summarized in Alg. 6. Notice that we do not 
compute $F_{\gamma,\alpha}(P)$, but its polarization residual $P - F_{\gamma,\alpha}(P)$, because the 
latter enters in the Anderson matrix (6.25) and in the Anderson update 
(6.23). 


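Algorithm 6 Polarization step $\mathrm{DR}_{\gamma,\alpha}(P_{\mathrm{old}})$, see Schneider et al. (2019) 


1: $P \leftarrow P_{\mathrm{old}}$ 
2: $P \leftarrow Z^0(P)$    ▷ Update estimates of $\mu$ and $L$ 
3: $\hat{P} \leftarrow \mathrm{FFT}(P)$ 
4: $\hat{P}(\xi) \leftarrow \hat{P}(\xi) - 2\,\hat{\Gamma}(\xi) : \hat{P}(\xi)$, $\xi \neq 0$    ▷ Apply $Y^0$ operator 
5: $\hat{P}(0) \leftarrow \hat{P}(0) + \frac{2}{\gamma} \, \bar\varepsilon$    ▷ Fix average polarization, cf. (6.27) 
6: $P \leftarrow \mathrm{FFT}^{-1}(\hat{P})$ 
7: residual $\leftarrow \frac{1}{2} \, \|P - P_{\mathrm{old}}\|_{L^2} / \|\langle \sigma \rangle_Y\|$    ▷ $\langle \sigma \rangle_Y$ is a byproduct of $Z^0$ 
8: return residual, $(1 - \alpha)(P_{\mathrm{old}} - P)$ 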


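Alongside, we compute the residual 


$$\mathrm{residual}(P) = \frac{1}{2} \, \frac{\|P - F_{\gamma,0}(P)\|_{L^2}}{\|\langle \sigma(\varepsilon) \rangle_Y\|}, \tag{6.30}$$


where $\langle \sigma(\varepsilon) \rangle_Y$ denotes the average stress associated to the polarization 
$P = \sigma(\varepsilon) + 1/\gamma \, \varepsilon$. The residual (6.30) measures the strain compatibility, 
the stress equilibrium and the average value of the strain in view of the 
identity (Schneider et al., 2019, Sec. 5) 


$$\frac{1}{4} \, \|P - F_{\gamma,0}(P)\|_{L^2}^2 = \|\Gamma : \sigma(\varepsilon)\|_{L^2}^2 + \frac{1}{\gamma^2} \left( \|(\mathrm{I} - \Gamma) : \varepsilon - \langle \varepsilon \rangle_Y\|_{L^2}^2 + \|\langle \varepsilon \rangle_Y - \bar\varepsilon\|^2 \right). \tag{6.31}$$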


The residual (6.30) depends on the step size $\gamma$, so some care has to 
be taken in comparing different solution schemes. However, this phe- 
nomenon is intrinsic, because conditions on the strain and the stress 
field have to be enforced, and the step size helps converting between the 
different physical units of strain and stress. 
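The polarization step and its residual can be illustrated on a one-dimensional, linear elastic laminate, where everything is explicit. The following is a minimal sketch under simplifying assumptions that are not part of the text above: the problem is scalar with $\sigma = c \, \varepsilon$, the projector $\Gamma$ reduces to removing the mean value, so that $\hat{\Gamma}(\xi) = 1$ for $\xi \neq 0$, and the mean of the polarization is shifted by $2/\gamma \, \bar\varepsilon$ in line with the fixed-point mapping of the polarization scheme. The function name `polarization_step` and all parameter values are hypothetical.

```python
import numpy as np


def polarization_step(P_old, c, gamma, alpha, eps_bar):
    """One polarization step for a 1D laminate with sigma = c * eps.

    Returns the residual and the polarization residual P_old - F(P_old)."""
    sigma = c * P_old / (c + 1.0 / gamma)        # stress belonging to P_old
    P_hat = np.fft.fft(2.0 * sigma - P_old)      # Z^0(P) = 2 * sigma - P, then FFT
    P_hat[1:] *= -1.0                            # Y^0 = I - 2 * Gamma for xi != 0
    # numpy's FFT is unnormalized, so the mean value is P_hat[0] / N
    P_hat[0] += P_old.size * 2.0 * eps_bar / gamma
    P = np.fft.ifft(P_hat).real                  # P = F_{gamma,0}(P_old)
    residual = 0.5 * np.sqrt(np.mean((P - P_old) ** 2)) / abs(sigma.mean())
    return residual, (1.0 - alpha) * (P_old - P)


c = np.array([1.0] * 4 + [10.0] * 4)             # two-phase laminate (toy values)
gamma = 1.0 / np.sqrt(c.min() * c.max())         # step size 1 / sqrt(mu * L)
alpha, eps_bar = 0.0, 0.01
P = (c + 1.0 / gamma) * eps_bar                  # start from the homogeneous strain
for k in range(200):
    residual, R = polarization_step(P, c, gamma, alpha, eps_bar)
    if residual < 1e-10:
        break
    P = P - R                                    # Picard update P <- F(P)

eps = P / (c + 1.0 / gamma)                      # recover strain and stress
sigma = c * eps
```

Upon convergence, the stress is constant (equilibrium in 1D), the mean strain matches the prescribed value, and the effective stiffness equals the harmonic mean of the phase stiffnesses, as expected for a laminate loaded in series.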
The most expensive step for nonlinear material behavior is evaluating the 
operator $Z^0$ (6.12). In computational practice, it is often more convenient 
to use the equivalent expression 


$$Z^0(P) = 2 \left(\sigma^{-1} + \mathbb{D}^0\right)^{-1} (\mathbb{D}^0 : P) - P, \tag{6.32}$$


where $\mathbb{D}^0$ is the reference compliance. Indeed, for inelastic materials 
whose stress-strain relationship is governed by Hooke's law, the operator 
$Z^0$ may be computed by a standard call to the user-defined material law 
(with a modified stiffness) (Schneider et al., 2019, Sec. 6). The average 
stress 


$$\langle \sigma(\varepsilon) \rangle_Y = \left\langle \left(\sigma^{-1} + \mathbb{D}^0\right)^{-1} (\mathbb{D}^0 : P) \right\rangle_Y \tag{6.33}$$


is easily computed as a byproduct. The algorithm is summarized in 
Alg. 7, where a hat over a variable refers to the corresponding Fourier 
coefficients. The method may be implemented on $2(m + 1)$ strain-like 
fields. Implementations on the displacement field, as in Grimm-Strele 
and Kabel (2019), are not feasible because the iterates are not compatible. 
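The equivalence of the two expressions for $Z^0$ can be checked numerically in the linear elastic case, where the stress operator is a symmetric positive definite matrix. The sketch below uses random 6x6 matrices as stand-ins for stiffness tensors in a matrix notation; the names and values are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_spd(n):
    """Random symmetric positive definite matrix (stand-in for a stiffness)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)


C = random_spd(6)                 # linear stress operator sigma(eps) = C @ eps
C0 = 3.0 * np.eye(6)              # reference stiffness 1/gamma * I
D0 = np.linalg.inv(C0)            # reference compliance
P = rng.standard_normal(6)        # polarization at a single material point

# Definition: Z^0 = (sigma - C0)(sigma + C0)^{-1}
Z_def = (C - C0) @ np.linalg.solve(C + C0, P)

# Equivalent expression: the stress is obtained first, Z^0(P) = 2 * sigma - P
sigma = np.linalg.solve(np.linalg.inv(C) + D0, D0 @ P)
Z_alt = 2.0 * sigma - P
```

The intermediate quantity `sigma` is the stress associated to the polarization, which is why the average stress is available as a byproduct of evaluating $Z^0$.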


6.3 Numerical demonstrations 


6.3.1 General setup and organization 


The Anderson-accelerated polarization-based schemes, abbreviated as 
A2DR (Anderson-Accelerated Douglas-Rachford), following Fu et al. 
(2020), were implemented in an in-house FFT-based micromechanics 
solver, written in Python 3.7. Computationally expensive operations, 
such as applying $\Gamma^0$, evaluating the material law and the Anderson 
update (6.23), were realized as Cython extensions using OpenMP 
parallelization. For applying the fast Fourier transform, we use the FFTW 
library (Frigo and Johnson, 2005). Throughout, we rely on the staggered- 
grid discretization (Schneider et al., 2016). 


Algorithm 7 Anderson-accelerated polarization scheme ($\alpha$, maxit, tol) 


1: $P \leftarrow 0$    ▷ Alternative: Extrapolation from previous time steps 
2: initialize $\gamma$    ▷ Different choices possible, see Sec. 6.3 
3: $k \leftarrow 0$ 
4: initialize empty list $\mathcal{L}$ 
5: while $k <$ maxit do 
6:     $k \leftarrow k + 1$ 
7:     $P_{\mathrm{old}} \leftarrow P$ 
8:     residual, $R \leftarrow \mathrm{DR}_{\gamma,\alpha}(P_{\mathrm{old}})$    ▷ See Alg. 6; update $\mu$ and $L$ 
9:     if residual $<$ tol then 
10:        exit while loop 
11:    end if 
12:    update step size $\gamma$ (may be omitted for performance reasons) 
13:    append $P$ and $R$ to the list $\mathcal{L}$ 
14:    compute inner products of $R$ with older $R$'s from $\mathcal{L}$ 
15:    update matrix $A$ (6.25) 
16:    determine the mixing coefficients by equation (6.24) 
17:    compute new $P$ by equation (6.23) 
18:    discard superfluous $R$'s and $P$'s from $\mathcal{L}$ 
19: end while 
20: $\varepsilon \leftarrow (\mathbb{C}^0 + \sigma)^{-1}(P)$    ▷ Compute strain field 
21: return $\varepsilon$, residual, $k$ 


To describe the action of the corresponding discrete $\Gamma^0$ operator, we 
introduce the complex vectors 


kj() = eee. (6.34) 


where $\xi$ denotes a frequency vector and $h_j$ and $N_j$ are the mesh spacing 
and voxel count in $j$-direction, respectively. Then, the associated 
symmetrized gradient operator $D$ of the staggered-grid discretization has 
the Fourier-space representation 


_ 1 g 2ky ty | = (ky tz +keti) - (kids + kzüı) 
Du(£) = |” (k2û1 T ky tig) 7 2kzûa = (kzûs T k3fiz) : 
— (kgt + kitig) — (ksûz + keds) 2kzûs 
(6.35) 


where we suppress the $\xi$-dependency of $k$ and $\hat{u}$ for better readability. 
The action of $\Gamma^0$ in Fourier space reads 


a 2 kk? \ Hea 
Fre) = D (iip 1i) Me (6.36) 
0, otherwise, 


where $D^*(\xi)$ is the Hermitian adjoint of $D(\xi)$, 


Er —Fıkı + Tızka + tisks 
(D*r)(€) = Taıkı — Taaka + Tozkz |. (6.37) 


Tzıkı + Taaka — T33k3 


Our convergence criterion for the polarization-based schemes reads 


$$\frac{1}{2} \, \frac{\|P - F_{\gamma,0}(P)\|_{L^2}}{\|\langle \sigma(\varepsilon) \rangle_Y\|} \leq \delta, \tag{6.38}$$


using the residual defined in equation (6.30). For the strain-based 
schemes, which serve as performance benchmarks for A2DR, we use 


$$\frac{\|\Gamma : \sigma(\varepsilon)\|_{L^2}}{\|\langle \sigma(\varepsilon) \rangle_Y\|} \leq \delta. \tag{6.39}$$

Note that the criterion of the polarization-based schemes (6.38) checks 
the compatibility of the strain field and the deviation from the prescribed 
macroscopic strain in addition to the equilibrium of the stress field in 
(6.39), as each condition is only satisfied upon convergence. Unless 
explicitly stated otherwise, we solve to a tolerance of $\delta$ = 10~°. In case 
of multiple load steps, we use an affine linear extrapolation (Moulinec 
and Suquet, 1998) as an initial guess for our solution field. 


The computations in Sec. 6.3.2-6.3.5 were performed on a desktop 
computer with 32 GB RAM and a 6-core Intel i7-8700K CPU. The com- 
putations for Sec. 6.3.6 ran on a workstation with 512 GB RAM and two 
12-core Intel Xeon(R) Gold 6146 CPUs. 


These computational investigations are intended to demonstrate the 
power and versatility of A2DR, and are organized as follows. We start 
with a two-dimensional example in section 6.3.2, which permits us 
to study the dependence on the involved algorithmic parameters. In 
three dimensions, studies with large depth are prohibited by memory 
constraints. In section 6.3.3, we study a three-dimensional example 
with nonlinear constituents, but finite material contrast. The example 
serves as a standard benchmark for FFT-based solvers (Schneider, 2019a; 
2020a). In section 6.3.4, we study a linear elastic material including 
pores. Porous microstructures are known to be difficult for polarization
methods, because the optimum step size γ = 1/√(μL) is not sensible for
μ = 0. In section 6.3.5, we study a Metal-Matrix-Composite (MMC)
undergoing ratcheting. This example is challenging for two reasons.
First, the underlying material model is not a generalized standard
material, as the material tangent is not symmetric. In particular, the
convergence theory discussed in section 6.2 does not apply. Second,
the material tangent becomes increasingly ill-conditioned
for increased loading. Finally, in section 6.3.6, we study a
polycrystalline microstructure. The crystal-plasticity constitutive laws
involved are notoriously expensive to evaluate. Thus, Newton-Krylov
methods are usually the preferred choice for this type of problem, as
iterations of the linearized problem require much less computational
effort than the nonlinear evaluation. Furthermore, the specific material
law we utilize involves a softening behavior. In particular, the example
is not covered by the available convergence theory.


6.3.2 Continuous glass-fiber reinforced polymer 


(a) Microstructure 


(b) Accumulated plastic strain at E33 = 5% 


Figure 6.1: Continuous glass-fiber reinforced polymer - Microstructure and accumulated 
plastic strain for a uniaxial extension in z-direction 


As our first example, we consider polyamide 6.6 continuously reinforced
by glass fibers with a volume fraction of 30%. The microstructure, see
Fig. 6.1, is modeled as a two-dimensional periodic cell, generated by
the adaptive shrinking cell algorithm of Torquato and Jiao (2010). The
resulting structure contains 200 fibers and is resolved by 512 × 512 pixels.
The glass fibers are modeled as isotropic linear elastic and the polyamide
matrix is governed by J2-elastoplasticity with isotropic hardening, see
Sec. 3.3 in Simo and Hughes (1998). Following Doghri et al. (2011), we
use a linear exponential hardening function for polyamide

\[ \sigma_Y = \sigma_0 + k_1 p + k_2 \left(1 - \exp(-m p)\right), \tag{6.40} \]

where σ0 denotes the initial yield stress, k1 is the asymptotic hardening
modulus and k2 = σ∞ − σ0 specifies the difference between the saturated
and the initial yield strength for k1 = 0. The material parameters
are listed in Tab. 6.1. Please note that a similar microstructure was
considered in Ch. 3. For the present section, however, twice the volume
fraction and four times the resolution are investigated compared to
Ch. 3. In particular, we observe a more pronounced plastification caused
by the higher filler fraction.

Table 6.1: Glass-fiber reinforced polyamide: Material parameters of fibers and
matrix (Doghri et al., 2011)

Fibers   E = 72 GPa, ν = 0.22
Matrix   E = 2.1 GPa, ν = 0.3, σY = 29 MPa,
         k1 = 139 MPa, k2 = 32.7 MPa, m = 319.4
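
To make the hardening function (6.40) concrete, the following minimal sketch evaluates it with the matrix parameters of Tab. 6.1. It assumes that the 29 MPa entry of the table is the initial yield stress σ0; the function and constant names are chosen here for illustration only.

```python
import math

# Matrix parameters from Tab. 6.1 (stresses in MPa).
SIGMA_0 = 29.0   # initial yield stress (assumed to be the 29 MPa table entry)
K1 = 139.0       # asymptotic hardening modulus
K2 = 32.7        # saturation increment of the exponential term
M_EXP = 319.4    # dimensionless exponent of the exponential term

def yield_stress(p: float) -> float:
    """Linear exponential hardening (6.40) as a function of the
    accumulated plastic strain p."""
    return SIGMA_0 + K1 * p + K2 * (1.0 - math.exp(-M_EXP * p))

if __name__ == "__main__":
    for p in (0.0, 0.001, 0.01, 0.05):
        print(f"p = {p:6.3f}  ->  sigma_Y = {yield_stress(p):8.3f} MPa")
```

With m = 319.4, the exponential term saturates after roughly one percent of plastic strain, so for larger p the yield stress grows essentially linearly with slope k1.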

For this comparatively small two-dimensional example, we investigate
the convergence rate of A2DR with respect to the chosen depth m.
In particular, we are interested in the sensitivity of the results with
respect to the chosen algorithmic parameters. As shown in Sec. 6.2.1,
the convergence rate of the non-accelerated polarization-based schemes
depends on the choice of the step size γ and the damping parameter α.
Indeed, it was shown that, in practice, choosing a suboptimal step size
may increase the necessary iteration counts by orders of magnitude, see
Sec. 7 in Schneider et al. (2019). Hence, we aim to find a suitable depth
m for which the dependence of performance on γ and α is eliminated or,
at least, reduced.

Figure 6.2: Continuous glass-fiber reinforced polymer with elastic matrix


For a start, we consider a linear problem, where both glass fibers and
polyamide matrix are modeled as linear elastic. Using the formulation
of mixed boundary conditions by Kabel et al. (2016), the microstructure
is subjected to 5% uniaxial extension in z-direction. For a fixed step
size γ = 2/(μ + L), the required iteration counts and total run-times for
A2DR for different values of α and m are plotted in Fig. 6.2a.

As a general trend, we observe that the required number of iterations 
decreases up to a depth of m = 4 and stagnates afterwards. This decrease 
is not necessarily monotone, see the plot for α = 0.5. The total run-times
follow a similar trend. Anderson acceleration introduces only a small 
overhead. In particular, for this section, it suffices to investigate either 
the iteration count or the timing. We revisit this topic for the larger 
microstructure in Sec. 6.3.3, where the computational cost of the update 
steps (6.23)-(6.25) is more pronounced. 

Taking a look at the effect of the damping parameter α for m = 0 (i.e.,
when Anderson acceleration is deactivated), we note that the iteration
counts range from 205 for Monchiet-Bonnet's choice α = 0.25 (Monchiet
and Bonnet, 2012) to 310 for α = 0.5, corresponding to Michel-Moulinec-
Suquet's accelerated scheme (Michel et al., 2001). For depth m ≥ 4,
the difference between these choices is largely eliminated, and A2DR
converges in roughly 50 iterations for all damping factors considered.
These results indicate that, in addition to the faster convergence, A2DR
relieves the user from the task of selecting the damping factor carefully.
Next, we take a look at the influence of the step size γ for a fixed damping
factor of α = 0.25, see Fig. 6.2b. Starting at m = 0, the slowest choice
of γ = 1/L requires about 10 times as many iterations to converge
as the theoretically optimum choice γ = 1/√(μL). Activating
Anderson acceleration reduces this performance gap significantly. Up
to a depth of m = 3, the iteration counts decrease for all step sizes
and stagnate afterwards. For the optimum step size the effect is least
pronounced, with a decrease from 46 iterations for m = 0 to 35 iterations
for m = 3. However, for all other step sizes the iteration counts are
significantly reduced, leading to a factor of less than 2 between the
slowest and fastest choice for m ≥ 3. Notably, the iteration count for
γ = 1/μ matches that of the theoretically optimum choice for m = 3 and
is even slightly lower for higher depths.

In conclusion, we observe that the performance of A2DR with a depth
of m = 4 is largely independent of the damping factor α. The influence
of the step size on performance does not vanish, but is significantly
reduced compared to the classical polarization-based schemes without
Anderson acceleration. In particular, step-size choices such as γ = 1/L
or γ = 2/(μ + L) become competitive using A2DR, making polarization-
based schemes available for materials where μ is close to or equal to 0,
see Sec. 6.3.4 and 6.3.5.
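
The step-size choices compared in this section can be collected in a small helper. This is only a sketch: the function name and the dictionary keys are made up for illustration, and the numerical values of μ and L in the usage note are hypothetical rather than taken from the computations.

```python
import math

def step_sizes(mu: float, L: float) -> dict:
    """Candidate step sizes gamma for bounds 0 <= mu <= L.

    The choices 1/sqrt(mu*L) and 1/mu are only returned for mu > 0,
    since they are not usable for porous materials (mu = 0).
    """
    if L <= 0.0 or mu < 0.0 or mu > L:
        raise ValueError("expected 0 <= mu <= L and L > 0")
    sizes = {
        "1/L": 1.0 / L,
        "2/(mu+L)": 2.0 / (mu + L),
    }
    if mu > 0.0:
        sizes["1/sqrt(mu*L)"] = 1.0 / math.sqrt(mu * L)
        sizes["1/mu"] = 1.0 / mu
    return sizes
```

For instance, `step_sizes(1.0, 100.0)` returns all four choices, whereas `step_sizes(0.0, 100.0)` only returns 1/L and 2/(μ + L), which reduces to 2/L.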

To check whether the results of the linear elastic setting carry over to
nonlinear problems, we consider the case where the polyamide matrix is
governed by J2-elastoplasticity. The boundary conditions are imposed
as for the linear elastic case, i.e., 5% uniaxial extension in z-direction,
applied in 50 equidistant load steps. For computing the step size, μ
and L are estimated in the first iteration of each load step based on
the tangent field, see remark 2 in Sec. 6.2.1. The step size is then kept
fixed for the remainder of the load step. Using this strategy, the effect
of Anderson acceleration with respect to different damping factors and
step-size choices is qualitatively similar to the linear elastic setting, see
Fig. 6.3a and Fig. 6.3b. For depths m ≥ 4, the impact of the damping
factor is largely eliminated and iteration counts stabilize. However, the
difference between the investigated step-size choices is more pronounced
in the nonlinear case. For the unaccelerated schemes, the slowest choice
γ = 1/L requires 33 times more iterations than the optimum choice
γ = 1/√(μL). At a depth of m = 4, the factor between the slowest and
fastest step size is reduced to 4.

Figure 6.3: Continuous glass-fiber reinforced polymer with elastoplastic matrix - Iteration
count vs depth m for various step-size choices and fixed damping factor α = 0.25,
including a zoom on the right-hand side

A few concluding remarks are in order. As an alternative strategy for
computing the step size γ, we investigated the approach of Schneider
et al. (2019), where μ and L are computed in every iteration and γ is
subsequently updated based on the minimum value of μ and the maximum
value of L over all past iterates. Upon Anderson acceleration, no positive
effect on the convergence behavior was observed in practice, except for
very large nonlinear load steps. Furthermore, the overall performance
with respect to run-time suffered due to the overhead of computing μ
and L. Thus, we use the simpler strategy of updating μ and L at the
beginning of each load step for the remainder of the manuscript. As a
second remark, based on the results up to this point, choosing γ = 1/μ
seems to be competitive when using A2DR, as the resulting performance
was often similar to or better than for the theoretically optimum value of
γ = 1/√(μL). However, whenever μ is unknown or zero, neither γ = 1/μ
nor γ = 1/√(μL) can be used. In addition, γ = 1/μ was found to
result in low convergence rates for high accuracy. Hence, we prefer
γ = 1/√(μL) where applicable.

Larger values for the depth m, up to 200, were tested for A2DR, leading, 
however, to no further decrease of the iteration count. Thus, for the 
sake of readability, these results were omitted in the respective plots 
of this section. Interestingly, this strongly differs from the behavior 
observed for the Anderson-accelerated basic scheme, see Sec. 3.4.2, 
where iteration counts were found to decrease up to depths of m = 50 
(albeit at the cost of slower overall performance, due to computational 
overhead). This further demonstrates the difficulty of finding the optimum
step size of the basic scheme. As exemplified by adaptive step-size
selection schemes, e.g., by Barzilai and Borwein (1988) or Malitsky and
Mishchenko (2020), a constant step size of γ = 2/(μ + L) does not
yield the best possible performance for gradient descent when time
to solution is the primary objective. In contrast, using the optimum 
constant step size for polarization-based schemes appears to leave less 
room for improvement. 


6.3.3 Short glass-fiber reinforced polymer 


Motivated by the results for the 2-dimensional example of the last section,
we compare the performance of A2DR to other modern FFT-based
solvers, based on a larger 3-dimensional microstructure that serves as
a recurring benchmark example for FFT-based solvers, see Sec. 3.3 in
Schneider (2019a). More precisely, we consider a polyamide matrix,
reinforced by 1140 glass fibers with an aspect ratio of 30, filling 20% of the
volume, see Fig. 6.4. The microstructure was generated using the sequential
addition and migration algorithm (Schneider, 2017b) and resolved by
256³ voxels. The fiber orientation in the resulting microstructure is close
to unidirectional with a second-order fiber-orientation tensor (Advani
and Tucker, 1987) of A = diag(0.8, 0.1, 0.1). Throughout, we use the
material parameters listed in Tab. 6.1, as in Sec. 6.3.2.

(a) Microstructure
(b) E11 = 5%

Figure 6.4: Short glass-fiber reinforced polymer - Microstructure and von Mises strain-field
for uniaxial extension in x-direction

First, we restrict to the linear elastic problem with an applied loading
of 5% uniaxial extension in fiber direction. We solve up to a high
accuracy of δ = 10⁻¹⁰ to get a better picture of the convergence rate
of the investigated FFT-based solvers. We compare the performance of
A2DR with α = 0.25, m = 4 and the optimum step size γ = 1/√(μL)
to Monchiet-Bonnet's scheme with γ = 1/√(μL) (Monchiet and Bonnet,
2012), the conjugate gradient (CG) method (Zeman et al., 2010; Brisard
and Dormieux, 2010), the Barzilai-Borwein (BB) basic scheme (Barzilai
and Borwein, 1988; Schneider, 2019a) and the original basic scheme by
Moulinec and Suquet (1998). Throughout, the optimum algorithmic
parameters are used for the Lippmann-Schwinger solvers. To be precise,
we choose γ = 2/(μ + L) for the reference material of the basic scheme
and the CG method (where the choice does not matter, see Zeman et al.
(2010)). The Barzilai-Borwein (BB) method is initialized with
γ = 2/(μ + L) as well and adaptively selects its step size after the first
iteration (Schneider, 2019a). In Fig. 6.5a, we see the excellent
convergence rate of A2DR, reaching the prescribed tolerance with the
lowest number of iterations among all investigated solvers. Most notably,
the performance of A2DR and CG is nearly identical. Interestingly, for the
same benchmark, see Sec. 3.3 in Schneider (2019a), the author already
observed that the (non-accelerated) Eyre-Milton method mirrored the
performance of CG for low accuracy. Using Anderson acceleration, this
advantage is preserved up to the investigated tolerance of δ = 10⁻¹⁰.
Both the Barzilai-Borwein (BB) method and the Monchiet-Bonnet scheme
exhibit similar convergence rates up to an accuracy of 10⁻⁷. Subsequently,
the residual of the Barzilai-Borwein method decreases rapidly, leading to a
lower final iteration count. Note, however, that this is only a fortuitous
byproduct of the inherently non-monotone convergence behavior of the
algorithm³. In the aforementioned numerical experiment in Sec. 3.3 of
Schneider (2019a), the final iteration count of the Barzilai-Borwein
method was very close to the one we observe for Monchiet-Bonnet's
method, which is roughly 50% higher than the iteration counts
of A2DR and CG. The basic scheme is not competitive, being an order of
magnitude slower than the other investigated schemes.

Taking a look at the overall performance in terms of computation time, 
the ranking between the solvers changes slightly, see Fig. 6.5b. Es- 
sentially, the Barzilai-Borwein (BB) method and A2DR switch places, 
with one being slightly faster and the other being slightly slower than 
CG. This is due to the lower computational cost per iteration of the 
Barzilai-Borwein method, see Tab. 6.2. Using the complexity-reduction 


3 Also, a rapid decrease of the residual from 107? to 1075 may be observed around 
iteration 50. 
trick in Sec. 6 of Schneider et al. (2019), for all investigated algorithms,
the computational effort of evaluating the material law, applying the
Γ⁰-operator and computing the residual is very similar. Whereas an
iteration of the Barzilai-Borwein method only requires a single inner
product and one addition of two fields on top of the aforementioned
steps, the A2DR update involving equations (6.23)-(6.25) requires
computing m + 1 inner products, solving a linear system of size m + 1 and
summing m + 1 fields. As a consequence, the cost per iteration for
A2DR(m = 4) ends up being roughly 50% higher compared to the
Barzilai-Borwein method.

Figure 6.5: Short glass-fiber reinforced polymer - Performance comparison for various
solution schemes
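
To illustrate where the extra cost per iteration comes from, the following sketch implements a generic, textbook-style Anderson acceleration of a fixed-point iteration x = F(x) with depth m. It is not the A2DR update of equations (6.23)-(6.25), which is not reproduced in this section; in particular, the small linear system is replaced here by a least-squares solve over residual differences, and all names are chosen for illustration.

```python
import numpy as np

def anderson_fixed_point(F, x0, depth=4, tol=1e-10, max_iter=200):
    """Anderson-accelerated fixed-point iteration for x = F(x) (generic sketch).

    depth = 0 recovers the plain iteration x_{k+1} = F(x_k).
    Returns the approximate fixed point and the number of F-evaluations.
    """
    x = np.asarray(x0, dtype=float)
    G_hist, R_hist = [], []  # stored values F(x_i) and residuals F(x_i) - x_i
    for k in range(1, max_iter + 1):
        g = F(x)
        r = g - x
        if np.linalg.norm(r) <= tol:
            return x, k
        G_hist.append(g)
        R_hist.append(r)
        # Keep the current entry plus at most `depth` previous ones.
        G_hist, R_hist = G_hist[-(depth + 1):], R_hist[-(depth + 1):]
        if len(R_hist) == 1:
            x = g  # no history available: plain fixed-point step
            continue
        # Differences of consecutive residuals and function values as columns.
        dR = np.column_stack([R_hist[i + 1] - R_hist[i]
                              for i in range(len(R_hist) - 1)])
        dG = np.column_stack([G_hist[i + 1] - G_hist[i]
                              for i in range(len(G_hist) - 1)])
        # Small least-squares problem: minimize || r - dR @ coeff ||.
        coeff, *_ = np.linalg.lstsq(dR, r, rcond=None)
        x = g - dG @ coeff
    return x, max_iter
```

On top of one evaluation of F, each accelerated step needs inner products between the stored residuals, a small dense solve and a linear combination of up to m + 1 stored fields, which matches the cost structure described above. For fields with millions of unknowns, the storage of the history dominates, whereas the small solve is negligible.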

Table 6.2: Short glass-fiber reinforced polymer - computational cost per iteration for the
investigated solution schemes

                  Iter. count   Total run-time   Run-time per iter.
A2DR              79            184.2 s          2.33 s
Monchiet-Bonnet   116           222.3 s          1.91 s
CG                82            159.3 s          1.94 s
BB                93            149.7 s          1.61 s
Basic scheme      763           1210.7 s         1.59 s

Last but not least, we consider the nonlinear problem with J2-elastoplastic
matrix behavior. The loading of 5% uniaxial extension is applied in 50
equidistant steps. We add the Newton-CG method (Gélébart and
Mondon-Cancel, 2013; Kabel et al., 2016) to our list of investigated
schemes in place of the linear CG method. To be precise, we use Dong's
line search criteria (Dong, 2010) for controlling the step size of the
Newton update and prescribe Eisenstat-Walker's forcing term choice
2 (Eisenstat and Walker, 1996) as tolerance for the linear system, see
Ch. 3. Note that, for Newton-CG, we take the sum of Newton iterations
and linear CG iterations for computing the iteration count. Taking a
look at Fig. 6.6, we see that the polarization-based schemes outperform
the (Quasi-)Newton methods and the basic scheme. A2DR is fastest,
followed by Monchiet-Bonnet's method, whose run-time is 35% higher.
Both Newton-CG and the Barzilai-Borwein (BB) method perform similarly,
taking roughly twice as long as A2DR to finish.

In conclusion, we see that Anderson acceleration further improves the 
performance of the already powerful polarization-based methods for 
finitely-contrasted materials. A depth of m = 4 emerges as a reasonable 
choice for both linear and nonlinear problems. Note that the improved 
performance of the Anderson-accelerated polarization-schemes comes 
at a price. Using A2DR with a depth of m = 4 requires the storage of 10 
strain-like fields, not counting internal variables. Compared to 1 strain- 
field for the basic scheme, 2 for the Barzilai-Borwein method and 8.5 for 
Newton-CG (when storing the tangent-field), this represents a rather 
large memory footprint. This is exacerbated by the fact that
polarization-based schemes do not permit a displacement-based
implementation, which can roughly halve the memory requirements of the
aforementioned strain-based methods.

Figure 6.6: Short glass-fiber reinforced polymer - Performance comparison for various
solution schemes
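
The memory footprint quoted for the different solvers can be turned into rough numbers for the 256³ benchmark. The sketch below assumes double precision and six independent components per symmetric strain-like field; both are assumptions made here, and internal variables are not counted, as in the text.

```python
def field_memory_gib(n_fields: float, voxels: int,
                     components: int = 6, bytes_per_value: int = 8) -> float:
    """Memory in GiB for n_fields strain-like fields on a regular voxel grid."""
    return n_fields * voxels * components * bytes_per_value / 2**30

VOXELS = 256**3

# Number of stored strain-like fields per solver, as stated in the text.
FIELDS = {
    "Basic scheme": 1,
    "Barzilai-Borwein": 2,
    "Newton-CG (stored tangent)": 8.5,
    "A2DR (m = 4)": 10,
}

if __name__ == "__main__":
    for name, n in FIELDS.items():
        print(f"{name:28s} {field_memory_gib(n, VOXELS):6.3f} GiB")
```

Under these assumptions a single field occupies 0.75 GiB, so A2DR with m = 4 needs about 7.5 GiB for its strain-like fields, compared to 0.75 GiB for the basic scheme.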


Table 6.3: Sand-core - Material parameters of sand grains and binder (Daphalapurkar et al., 
2011; Wichtmann and Triantafyllidis, 2010; Sanditov et al., 2009) 


Quartz sand grains   E = 66.9 GPa, ν = 0.25
Quartz sand binder   E = 71.7 GPa, ν = 0.17


6.3.4 A Sand-core microstructure 


For this example, we investigate a sand-core microstructure, discretized
by 256³ voxels. The structure consists of 64 sand grains with a volume
fraction of 58.58% held together by an inorganic binder with 1.28% volume
fraction, see Fig. 6.7. For a detailed treatment of the microstructure
generation and the linear elastic properties of the material, we refer to
Schneider et al. (2018). The material parameters of the
constituents are listed in Tab. 6.3.

The sand-core microstructure represents a porous material for which μ is
usually unknown (Schneider, 2020b). Using the natural estimate μ = 0
makes the step size γ = 1/√(μL), which is optimal for the polarization-
based schemes, not applicable. Hence, in numerical experiments,
their performance for solving this type of problem was found to be
poor (Schneider, 2019a), exhibiting even lower convergence rates than
the basic scheme. In contrast to fast gradient solvers (Schneider, 2017a;
2020a) or (Quasi-)Newton methods (Gélébart and Mondon-Cancel,
2013; Schneider, 2019a; Wicht et al., 2020b), this has prevented using
polarization-based schemes as general-purpose algorithms.

(a) Microstructure (b) E11 = 1%

Figure 6.7: Sand-core structure - Microstructure and von Mises strain-field for uniaxial
extension in x-direction

In this context, we investigate whether A2DR can increase the
efficiency of polarization-based schemes to competitive levels. To this end,
the sand-core microstructure is subjected to 1% uniaxial extension in
x-direction and we solve up to a tolerance of δ = 10⁻¹⁰ to determine
the convergence rate of the algorithms. We fix α = 0.25 and consider the
available step sizes for μ = 0, i.e., the conservative choice γ = 1/L and
the optimum step size of the basic scheme γ = 2/(μ + L), i.e., γ = 2/L
for μ = 0.

As for finitely contrasted media, Anderson acceleration substantially
improves the performance for the investigated step-size choices, see
Fig. 6.8a. In agreement with the results by Schneider (2019a),
Monchiet-Bonnet's method with the step size of the basic scheme slows
down considerably over the course of the computation and fails to
converge within 1000 iterations. In contrast, A2DR with the same step
size exhibits a linear convergence rate, requiring fewer than 300 iterations
to reach the prescribed tolerance. Using γ = 1/L roughly triples iteration
counts and run-times.

Figure 6.8: Sand-core structure - Residual vs iteration

Fig. 6.8b reveals that, using Anderson acceleration, polarization-based
schemes become competitive with the fastest available FFT-based solvers
for porous media. More precisely, A2DR(m = 4, γ = 2/(μ + L)) ends up
just between CG and the Barzilai-Borwein method in terms of iteration
count, requiring 25% more than the former and 25% less than the latter.
In terms of overall performance, it matches the Barzilai-Borwein method,
with both methods running about 30% longer than CG.


To summarize, Anderson acceleration makes the computational power
of polarization-based schemes available for treating porous
microstructures, considerably broadening their range of application. The
optimum step size of the basic scheme γ = 2/(μ + L) emerges as a decent
general-purpose choice for materials with both finite and infinite contrast.

Figure 6.9: Metal-matrix composite - Microstructure and von Mises strain-field for a cyclic
uniaxial stress loading


6.3.5 Metal-matrix composite under cyclic loading 


For our next example, we investigate the cyclic behavior of a metal- 
matrix composite (MMC). The microstructure consists of 50 spherical 
ceramic particles with a volume fraction of 30% embedded in a metal 
matrix, see Fig. 6.9a. For the particle placement, we relied on the 
random sequential addition algorithm (Widom, 1966) and the resulting 
microstructure was resolved by 128³ voxels.

The ceramic inclusions are assumed to be linear elastic, whereas the
material behavior of the matrix is governed by a J2-elastoplasticity
model with kinematic hardening, see, for instance, Chaboche (1989;
2008).

For simplicity, we neglect isotropic hardening, i.e., we model the yield
stress σY to be independent of the equivalent plastic strain p. The
governing equations for the model are given by Hooke's law

\[ \sigma = \mathbb{C} : (\varepsilon - \varepsilon_p) \quad \text{with} \quad \varepsilon = \varepsilon_e + \varepsilon_p, \tag{6.41} \]

the associated flow rule

\[ \dot{\varepsilon}_p = \dot{\gamma} \, \frac{\partial f}{\partial \sigma} \quad \text{with} \quad f(\sigma, X) = \sqrt{\tfrac{3}{2}} \, \| \mathrm{dev}(\sigma - X) \| - \sigma_Y, \tag{6.42} \]

the Karush-Kuhn-Tucker conditions

\[ f(\sigma, X) \le 0, \quad \dot{\gamma} \ge 0, \quad \dot{\gamma} \, f(\sigma, X) = 0, \tag{6.43} \]

and a kinematic hardening law, defining the evolution of X, i.e., the
center of the elastic domain.

Over time, a multitude of formulations has been proposed for the latter,
see, for instance, the reviews by Abdel-Karim (2005) or Kang (2008).
For the present study, we choose the kinematic hardening
law by Chaboche et al. (1979)

\[ X = \sum_{i=1}^{M} X_i, \qquad \dot{X}_i = \tfrac{2}{3} h_i \dot{\varepsilon}_p - \zeta_i X_i \dot{p}, \tag{6.44} \]

where X is decomposed into multiple parts, following the classical
Frederick-Armstrong rule (Frederick and Armstrong, 2007). Using a
backward Euler time discretization, we rely on the fixed-point
algorithm by Kobayashi and Ohno (2002) for the implementation of the
material model. Note that the tangent stiffness for this model is not
symmetric (Kobayashi and Ohno, 2002). For using the Newton-CG method,
Kobayashi and Ohno (2002) suggest using the symmetrized tangent
when solving the linear system. Following their recommendation, we
also estimate μ based on the symmetrized tangent, whereas L is fixed by
the elastic stiffness of the materials. The material parameters for both
constituents are listed in Tab. 6.4.


Table 6.4: Metal-matrix composite - Material parameters of matrix (Kobayashi and Ohno, 
2002) and ceramic particles (Segurado et al., 2002) 


Ceramic particles   E = 400 GPa    ν = 0.2

Metal matrix        E = 165 GPa    ν = 0.3
                    σY = 240 MPa
                    ζ1 = 2000      ζ2 = 500
                    ζ3 = 200       ζ4 = 50
                    h1 = 100 GPa   h2 = 25 GPa
                    h3 = 8 GPa     h4 = 5 GPa


We consider a cyclic uniaxial stress loading with a mean stress value
of 100 MPa and an amplitude of 300 MPa. The loading is applied
over 4 cycles, discretized by 30 equidistant steps per cycle. Each step
is solved up to a tolerance of δ = 10⁻⁵. Note that, in contrast to
Sec. 6.3.3, the prescribed hardening law leads to a stress operator which
is neither derived from a potential nor strictly monotone. Indeed,
in our numerical experiments, the lower bound of the tangent field
quickly approached zero during plastification, preventing the use of
the optimum step size γ = 1/√(μL). Hence, in combination with the
non-monotonic loading, the metal-matrix composite with kinematic
hardening constitutes a challenging benchmark, which is not covered by
the theoretical treatment of Sec. 6.2.

For evaluating the performance of the solvers, we fix the algorithmic
parameters of A2DR to α = 0.25, γ = 2/(μ + L) and m = 4 based on
the results of the previous sections. Comparing the iteration counts
and run-times of the different FFT-based solution schemes in Fig. 6.10,
we see that A2DR performs admirably. It requires the lowest number
of total iterations and ties with the Barzilai-Borwein method for the
fastest computation time. Monchiet-Bonnet's method, with the same
algorithmic parameters as A2DR (except for m = 0), takes more than
twice as long to converge, with an overall performance comparable to
the Newton-CG method. Note that the effect of Anderson acceleration,
while still significant, is less pronounced compared to the numerical
experiment on a porous structure in Sec. 6.3.4. This is due to the cyclic
loading, where plastic flow in the matrix material is only activated
for high stress magnitudes, see Fig. 6.9b. In the elastic loading and
unloading steps, all solvers converge in a single iteration owing to the
affine-linear extrapolation. Hence, the superior performance of A2DR is
only realized in roughly half of all load steps. For the same reason, the
basic scheme performs comparatively well for this problem, with a final
iteration count just over three times higher than A2DR.

Figure 6.10: Metal-matrix composite - Performance comparison for various solution
schemes

Figure 6.11: NiAl-9Mo - Microstructure and creep behavior


6.3.6 Directionally solidified NiAl-9Mo 


For our final example, we turn to a directionally solidified NiAl-9Mo
eutectic alloy used as a benchmark problem in Sec. 4.6.3. The considered
microstructure with 84 unidirectionally aligned fibers with square
cross-section was generated using the random sequential addition
algorithm (Widom, 1966) and resolved by 1200 × 160 × 160 voxels. The
fibers have an aspect ratio of 100 (Haenschke et al., 2010) and take up
14% of the overall volume. Following Albiez et al. (2016a), the behavior
of fibers and matrix is governed by a single-crystal elastoviscoplasticity
model based on Hooke's law

\[ \sigma = \mathbb{C} : (\varepsilon - \varepsilon_p), \quad \text{with} \quad \varepsilon = \varepsilon_e + \varepsilon_p \tag{6.45} \]

and the classical power-law flow rule by Hutchinson (1976)

\[ \dot{\varepsilon}_p = \sum_{\alpha=1}^{N} \dot{\gamma}_\alpha \, d_\alpha \otimes^s n_\alpha \quad \text{with} \quad \dot{\gamma}_\alpha = \dot{\gamma}_0 \, \mathrm{sgn}(\tau_\alpha) \left| \frac{\tau_\alpha}{\tau^F} \right|^n, \tag{6.46} \]


where d_α and n_α denote the slip direction and the slip-plane normal,
respectively, γ̇0 is the reference slip-rate and τ^F is the yield stress. The
operator ⊗^s denotes the symmetrized dyadic product, i.e., d_α ⊗^s n_α =




Table 6.5: NiAl-9Mo - Material parameters of fibers and matrix at 1000°C Albiez et al. 


(2016a) 
Molybdenum fiber Nickel-aluminum matrix 
Stiffness Ci, = 404 GPa Ci, = 182 GPa 
C2 = 163 GPa Ci. = 120 GPa 
C44 = 99 GPa C44 = 85.4 GPa 
Flow rule yo = 8.965! Yo = 1073 s71 
n = 10.5 n = 4.04 
Hardening Too = 3833 MPa TE = 30.75 MPa 
d = 0.729 um 
Po = 9 x 10? mm? 
ps = 2.3 x 10° mm? 
k2 = 66 
Lattice type BCC B2 
Slip systems {110}(111) {001} (100) 
{112}(111) £011}(100) 
{123}(111) {011}(110) 


(d_α ⊗ n_α + n_α ⊗ d_α)/2 and the shear stress in a slip system is computed
by τ_α = σ : d_α ⊗^s n_α. The matrix is assumed to behave perfectly plastic,
i.e., the yield stress is constant, τ^F = τ^F_0. For the molybdenum fibers,
Albiez et al. (2016a) proposed the softening law


F Too 


in terms of the dislocation density ρ, with maximum yield stress τ^F_∞
and a characteristic length parameter d. The authors used the
storage-recovery model by Kocks and Mecking (2003) for the evolution of the
dislocation density, which permits expressing ρ as an explicit function of
the accumulated plastic slip γ = Σ_α γ_α

\[ \rho = \rho_s \left[ 1 - \exp\left( -\tfrac{1}{2} k_2 \gamma \right) \left( 1 - \sqrt{\frac{\rho_0}{\rho_s}} \right) \right]^2 \tag{6.48} \]

upon integration. The parameters of the materials are listed in Tab. 6.5.

(a) Distribution of von Mises stresses in the matrix
(b) Distribution of von Mises strain in the fibers

Figure 6.12: NiAl-9Mo - Von Mises stress distribution at different stages of a uniaxial creep
test in fiber direction


Evaluating the material law ε ↦ σ(ε) for single-crystal elastoviscoplasticity
is computationally expensive compared to other operations such
as applying Γ⁰ and the associated FFTs, see Eghtesad et al. (2018a) or
Sec. 3.4.4. Typically, Newton methods enjoy the best performance for
this type of problem, as the material law is evaluated only once per
Newton iteration, whereas applying the material tangent is substantially
cheaper. This is why a performance comparison with A2DR is of interest
for this computationally demanding problem.
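
To give an impression of what a single evaluation of the power-law flow rule (6.46) involves, the following sketch computes the plastic strain rate for a given stress and a list of slip systems. It only covers the flow rule itself; the implicit time integration and the softening law of the fibers are left out, and the function names and the single slip system in the usage note are chosen here for illustration.

```python
import numpy as np

def sym_dyad(d, n):
    """Symmetrized dyadic product (d (x) n + n (x) d) / 2."""
    return 0.5 * (np.outer(d, n) + np.outer(n, d))

def plastic_strain_rate(sigma, slip_systems, gamma0_dot, tau_f, n_exp):
    """Plastic strain rate from the power-law flow rule (6.46).

    sigma: 3x3 stress tensor, slip_systems: iterable of (direction, normal)
    pairs, gamma0_dot: reference slip rate, tau_f: yield stress (same unit
    as sigma), n_exp: stress exponent.
    """
    eps_p_dot = np.zeros((3, 3))
    for d, n in slip_systems:
        d = np.asarray(d, dtype=float) / np.linalg.norm(d)
        n = np.asarray(n, dtype=float) / np.linalg.norm(n)
        schmid = sym_dyad(d, n)
        tau = np.tensordot(sigma, schmid)  # resolved shear stress sigma : schmid
        rate = gamma0_dot * np.sign(tau) * abs(tau / tau_f) ** n_exp
        eps_p_dot += rate * schmid
    return eps_p_dot
```

With the matrix parameters of Tab. 6.5 (reference slip rate 10⁻³ s⁻¹, n = 4.04, yield stress 30.75 MPa), a resolved shear stress equal to the yield stress produces exactly the reference slip rate on that system, and doubling the stress increases the rate by a factor of 2^4.04. This strong stress sensitivity is one reason why an implicit evaluation of such laws is costly.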

We consider a creep test, where a uniaxial stress loading with an am- 
plitude of 250 MPa is applied in a single load step for one second. 
Subsequently, the loading is held constant for 120 seconds, subdivided 
into 120 equidistant load steps. Throughout, we solve to an accuracy of 
6 = 10~°. The number of load steps was chosen to obtain a sufficiently 
fine resolution of the strain rate over time and to ensure the positive
definiteness of the tangent stiffness, which is not guaranteed due to the
softening law (6.47).

Figure 6.13: NiAl-9Mo - Von Mises stress-field at different stages of a uniaxial creep test in
fiber direction, showing the initial load transfer from fibers to matrix and the subsequent
fiber softening

The evolution of the local fields during the creep test, see Fig. 6.12- 
Fig. 6.14, illustrates the influence of the softening law (6.47) by Albiez 
et al. (2016a) on the effective material behavior. Owing to their high 
initial yield strength, the molybdenum fibers exhibit no plastic activity 
during the initial creep stage, behaving almost linear elastically. In 
contrast, the matrix plastifies almost immediately upon applying the 
initial stress loading. The viscous stresses in the matrix, caused by the 
high strain-rate during the initial loading, are subsequently transferred 
to the fibers. In turn, this leads to a further decrease of the overall strain 
rate, due to the higher creep resistance of the fibers. After roughly 30 
seconds, the creep rate reaches its minimum, see Fig. 6.11a. At this point, 
the stress in the fibers has increased up to a level where plastic slipping is 
initiated. With the change from elastic to plastic behavior, the softening 
law takes effect, resulting in a subsequent increase of the effective creep 
rate and decreasing stress levels in the fibers. 

Figure 6.14: NiAl-9Mo - Accumulated plastic slip at different stages of a uniaxial creep test
in fiber direction, showing the transition from elastic to plastic behavior in the fibers

The different stages of creep behavior are reflected in the iteration counts
of the solvers, see Fig. 6.15 and Fig. 6.11b. During the first few load steps,
a high number of iterations is required, due to the rapid load transfer
from matrix to fibers. As the creep rate stabilizes, the affine linear 
extrapolation takes effect and the iteration count per load step reaches a 
minimum roughly between step 30 and 60. Subsequently, the softening 
of the fibers causes an increase in the effective creep rate and leads to 
a higher material contrast. Thus the computational effort increases in 
the last 60 load steps. Comparing the investigated solution schemes, 
we observe that A2DR with the optimum step size $\gamma = 1/\sqrt{\mu L}$ closely
matches the performance of the Newton-CG method. To be precise, 
A2DR is slower in the first 20 load steps and enjoys a slight advantage 
afterwards. Roughly around load step 70, both schemes break even in 
terms of total computation time. In the end, A2DR is even slightly faster 
overall than Newton-CG. Using the more widely applicable step size 
$\gamma = 2/(\mu + L)$ for A2DR doubles the total run-time compared to the
optimum step size. Still, the slower option is about 30% faster than the 
Barzilai-Borwein method.

Figure 6.15: NiAl-9Mo - Performance comparison for various solution schemes


6.4 Conclusions 


The present study was devoted to increasing the robustness and per- 
formance of polarization-based methods in FFT-based micromechanics, 
by applying Anderson acceleration, a general-purpose technique for 
accelerating fixed-point methods. 
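To illustrate the general idea, the following minimal sketch applies Anderson acceleration with a fixed depth to a generic fixed-point map $x = g(x)$. It is not the A2DR implementation studied in this chapter; the function name `anderson_fixed_point` and the linear test map are assumptions made purely for illustration.

```python
import numpy as np


def anderson_fixed_point(g, x0, depth=4, tol=1e-10, max_iter=200):
    """Anderson-accelerated fixed-point iteration for x = g(x) (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    gx = g(x)
    f = gx - x                      # fixed-point residual
    dG, dF = [], []                 # histories of differences of g and f
    for it in range(max_iter):
        if np.linalg.norm(f) <= tol:
            return x, it
        if dF:
            # Least-squares fit of the current residual by past residual differences.
            coeffs, *_ = np.linalg.lstsq(np.column_stack(dF), f, rcond=None)
            x_new = gx - np.column_stack(dG) @ coeffs
        else:
            x_new = gx              # plain fixed-point step without history
        g_new = g(x_new)
        f_new = g_new - x_new
        dG.append(g_new - gx)
        dF.append(f_new - f)
        if len(dF) > depth:         # keep only the last `depth` differences
            dG.pop(0)
            dF.pop(0)
        x, gx, f = x_new, g_new, f_new
    return x, max_iter


# Usage on a hypothetical linear contraction g(x) = M x + b.
M = np.diag([0.9, 0.5, -0.3])
b = np.array([1.0, 2.0, 3.0])
x_star, n_iter = anderson_fixed_point(lambda v: M @ v + b, np.zeros(3))
```

The stored history of depth `depth` is what causes the memory footprint discussed at the end of this section: each retained difference has the size of a full solution field.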

To demonstrate the usefulness of the proposed algorithm, we covered a 
wide spectrum of problems, including microstructures and material laws 
of varying complexity. To be more precise, in Sec. 6.3.2 and Sec. 6.3.3, 
we investigated finitely contrasted fiber-reinforced microstructures with 
elastic and J2-elastoplastic material behavior, which were covered by
the theoretical treatment in Sec. 6.2.1. For this class of problems, the 
excellent performance of polarization-based methods, see Schneider et al. 
(2019) and Schneider (2019a), could be further improved using Anderson 
acceleration. With respect to the choice of algorithmic parameters, we 
found that, using a depth of m = 4, the influence of the damping
parameter could be eliminated and the sensitivity with respect to the step
size was drastically reduced. Whereas the theoretically optimum step
size $\gamma = 1/\sqrt{\mu L}$ led to the best performance, when applicable, the step
size of the basic scheme $\gamma = 2/(\mu + L)$ emerged as a viable alternative.
In particular, when the strong convexity constant $\mu$ is unknown or tends
to zero, $\gamma = 2/L$ can be readily estimated from the elastic stiffness of the
constituent materials. 
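The step-size choices discussed above can be summarized in a few lines. This is a sketch only: the function name `step_size` and the `kind` argument are hypothetical, and $\mu$ and $L$ denote the strong convexity and Lipschitz constants of the problem at hand.

```python
import math


def step_size(L, mu=None, kind="optimal"):
    """Return a step size gamma from the bounds mu and L (illustrative helper).

    kind="optimal": gamma = 1 / sqrt(mu * L), requires a positive mu.
    kind="basic":   gamma = 2 / (mu + L), which reduces to 2 / L when mu
                    is unknown (None) or zero.
    """
    if L <= 0.0:
        raise ValueError("L must be positive")
    if kind == "optimal":
        if mu is None or mu <= 0.0:
            raise ValueError("the optimal step size needs a positive mu")
        return 1.0 / math.sqrt(mu * L)
    if kind == "basic":
        return 2.0 / ((mu or 0.0) + L)
    raise ValueError("unknown kind: " + kind)


gamma_opt = step_size(L=100.0, mu=4.0, kind="optimal")
gamma_basic = step_size(L=100.0, mu=4.0, kind="basic")
gamma_fallback = step_size(L=100.0, kind="basic")
```

The fallback branch mirrors the situation of porous or softening materials, where only the upper bound $L$ is available.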

This enabled us to investigate examples outside the framework of 
strongly convex optimization, where polarization-based schemes typi- 
cally struggle. For the porous sand-core structure in Sec. 6.3.4, a lower 
bound $\mu$ of the elastic stiffness was unavailable. In the problems of
Sec. 6.3.5 and Sec. 6.3.6, we considered computationally demanding 
material models which do not permit a potential-based formulation. 
Both cases incorporated material laws without a strictly monotone stress 
operator, with the latter example even including softening behavior. For 
all of these problems, Anderson acceleration led to substantial speed-ups 
compared to the classic polarization-based schemes, with A2DR being 
competitive to the fastest strain-based FFT-solvers. 

Indeed, to optimize performance in FFT-based micromechanics, a judicious
choice of the solution scheme is often inevitable. For instance,
CG is the natural choice for linear elastic problems, inexact Newton- 
CG is hard to beat if the cost of evaluating the material law is much 
larger than applying the tangent and the Barzilai-Borwein method is 
excellent if the material law is cheap to compute, see Ch. 3. In this 
study, we demonstrated that A2DR closely matches (or even beats) the 
performance of these schemes in each of their "ideal" settings. Thus, 
A2DR represents a robust and powerful solution scheme, which is close 
to optimal for a wide range of problems. 

However, the excellent performance of the method is still accompanied 
by a large memory footprint. Future work may be devoted to investigating
alternative vector-sequence acceleration techniques (Ramière and
Helfer, 2015; Brezinski et al., 2021), seeking methods with lower memory
requirements which preserve the advantages of Anderson acceleration. 
Last but not least, the efficiency of polarization-based methods relies 
on the cheap evaluation of the nonlinear $Z^0$-operator. Extending the
complexity-reduction technique of Schneider et al. (2019) to a wider 
class of materials would further increase the usefulness of A2DR as a 
general-purpose method.

Chapter 7


On the impact of the
mesostructure on the creep
response of cellular NiAl-Mo
eutectics¹


7.1 Introduction


Directionally solidified NiAl-Mo eutectics, consisting of well-aligned 
single-crystalline Mo-fibers embedded in a NiAl matrix (Bei and George, 
2005), are an appealing candidate for structural high-temperature ap- 
plications (Darolia, 1991). However, several studies (Misra et al., 1998; 
Haenschke et al., 2010; Seemüller et al., 2013) demonstrated that the 
microstructure of the alloy is rather sensitive to the manufacturing 
process. In particular, insufficient temperature gradients and/or high 
growth rates, which are desirable from the viewpoint of industrial 
application, lead to deviations from an ideal microstructure of perfectly 
aligned Mo-fibers in the NiAl matrix. On the mesoscale, NiAl-Mo 
develops cellular structures (Misra et al., 1998; Seemüller et al., 2013)
in which regions of well-aligned fibers are surrounded by degenerated 


1 This chapter is based on Wicht et al. (2022). For the sake of a coherent structure, 
formatting and typography of this thesis, minor changes have been made. To avoid 
redundancies in the text, the introduction has been shortened.
regions with higher NiAl fraction and coarse, misaligned Mo fibers, see
Fig. 7.1. Indeed, Gombola et al. (2020) revealed that similar structures 
emerge for various compositions in the NiAl-(Mo,Cr) system. 


Seemüller et al. (2013) showed that cell formation lowers the creep
resistance to a level between that of well-aligned NiAl-Mo and that of
binary NiAl. The ability to model and predict the creep behavior of cellular
NiAl-based composites appears crucial, as: (i) Perfect laboratory conditions for
producing NiAl-based eutectics may not always be available in an 
industrial context where high growth rates are preferred. (ii) The process 
conditions to achieve perfect alignment become challenging in case of 
advanced complex alloying compositions with extended solidification 
intervals (Gombola et al., 2020). The applied temperature gradients 
need to cover the solidification interval in the transition zone from the 
liquid to the solids in order to obtain stable processing conditions during 
solidification. Thus, cellular microstructures become more likely under 
practical conditions. Determining the impact of partially interrelating 
morphological features, such as cell volume fraction and aspect ratio, on 
the mechanical behavior is necessary, not only to assess the sensitivity 
of the overall creep response to microstructural irregularities, but also 
for identifying suitable processing conditions and alloy compositions. 
Finally, a rather large disparity in reported experimental results, for
example regarding the apparent stress exponent of the composite (Albiez
et al., 2016a; Dudová et al., 2011; Hu et al., 2013; Seemüller et al., 2013),
might indicate that the mesostructure of the material has already played 
a role in some of the previous studies as will be highlighted in Sec. 7.2.2 
and Sec. 7.4.4. 

Thus, the aim of the present study is to investigate the creep behavior 
of cellular NiAl-Mo through creep simulations on the microscale. To 
this end, we use modern FFT-based methods (Moulinec and Suquet, 
1998), which have established themselves as powerful algorithms for 
computing the effective response of microstructured materials, such as

Figure 7.1: Structure of directionally solidified cellular NiAl-Mo sketched at different
length scales (macroscale, mesoscale, microscale) based on dark field optical microscopy
images by Seemüller et al. (2013)


composites (Burgarella et al., 2019; Wang et al., 2018a) and polycrystals 
(Lebensohn et al., 2012; Eisenlohr et al., 2013). In the context of mi- 
cromechanical creep simulations, the effective strain-rate is computed by 
volume averaging the strain-rate field on the microstructure level, which 
arises in response to a prescribed mean stress. The main difficulty for 
this task lies in the multi-scale nature of the problem, i.e., the difference 
in the characteristic length scales of the different geometric features of 
the material, see Fig. 7.1. While the cellular colonies are roughly 1 mm 
and 0.2mm in length and diameter, respectively, the diameter of the 
Mo fibers is in the sub-micron scale. Hence, if a volume element with 
multiple cells is considered for simulating the creep behavior, resolving 
the individual fibers will be infeasible. Instead, we follow Seemüller
et al. (2013) and divide the material on the mesoscale into soft regions for
the boundary, behaving similarly to the NiAl-matrix, and homogeneous
hard regions, mirroring the effective creep behavior of the well-aligned
NiAl-Mo colonies.

In order to bridge the different length scales involved in the simulations,
we proceed with the following steps:

1. A phenomenological surrogate model for the anisotropic creep re- 
sponse of the single well-aligned colonies is calibrated based on 
crystal plasticity simulations following Albiez et al. (2016a) in Sec. 7.2. 

2. Synthetic microstructures, mirroring the geometrical features of cel- 
lular NiAl-Mo, are generated based on the level set framework of 
Sonon et al. (Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015), see 
Sec. 7.3. 

3. Having gathered all necessary prerequisites, the effective creep behav- 
ior of cellular NiAl-Mo is investigated through FFT-based microme- 
chanics simulations in Sec. 7.4. 


7.2 Modeling the anisotropic creep behavior 
of well-aligned NiAl-Mo colonies 


7.2.1 Single crystal plasticity model for fiber and matrix 


In the following, we briefly review the material models and parameters
(Albiez et al., 2016a) used for characterizing the anisotropic creep
response of well-aligned NiAl-Mo. The material behavior of the NiAl-matrix
and the Mo-fibers is governed by a classical small-strain single-crystal
elasto-viscoplasticity model. In the following, $\varepsilon$ denotes the
infinitesimal strain tensor and $\sigma$ refers to the Cauchy stress tensor. The
linear elastic material behavior is governed by Hooke's law for the elastic
strains $\varepsilon_e$

$$\sigma = \mathbb{C} : (\varepsilon - \varepsilon_p) \quad \text{with} \quad \varepsilon = \varepsilon_e + \varepsilon_p \tag{7.1}$$

and the stiffness tensor $\mathbb{C}$. The plastic strain $\varepsilon_p$ due to dislocation glide
is realized as a linear combination of simple shears (Bishop, 1953) in
crystallographic slip systems characterized by their slip direction $d_\alpha$ and
slip plane normal $n_\alpha$, where the subindex $(\cdot)_\alpha$ refers to the $\alpha$-th of $N$ slip
systems. Assuming that the slip in the glide systems follows the classical
power-law flow rule of Hutchinson (1976), the flow rule reads

$$\dot{\varepsilon}_p = \sum_{\alpha=1}^{N} \dot{\gamma}_0 \, \mathrm{sgn}(\tau_\alpha) \left| \frac{\tau_\alpha}{\tau^F} \right|^{m} d_\alpha \otimes_s n_\alpha, \tag{7.2}$$

with shear stress $\tau_\alpha = \sigma : (d_\alpha \otimes_s n_\alpha)$, yield stress $\tau^F$, reference slip
rate $\dot{\gamma}_0$ and stress exponent $m$. We emphasize that the chosen flow
rule only covers plasticity due to conservative dislocation glide. More 
sophisticated models which include the smaller strain contribution 
of dislocation climb by adding additional non-conservative modes of 
deformation have been proposed, for instance, by Lebensohn et al. (2010). 
However, as Albiez et al. (2016a) demonstrate, the chosen approach 
(7.2) is able to predict the creep behavior of NiAl-Mo for tempera- 
tures between 900°C and 1000°C and stresses between 100 MPa and 
250MPa with good accuracy. Furthermore, the stress exponents m 
of the monolithic phases as well as of the composite are significantly 
larger than 1 indicating that diffusional contributions to the overall 
strain are negligible. Hence, to avoid the introduction and calibration 
of additional unknown material parameters, we restrict to the glide 
based formulation. Furthermore, the temperature dependence of the 
creep behavior is incorporated in the reference shear rate by Albiez 
et al. (2016a), using an Arrhenius approach. As an exemplifying study, 
we compare our modeling results mainly with the experiments by 
Seemüller et al. (2013), who carried out creep tests at 900°C. All material
parameters, experimental data and simulation results in this study
are given for this fixed temperature. In addition, experimental results
show that the softening of the Mo-fibers, i.e., the decrease of $\tau^F$ during
creep, is only weakly pronounced in the cellular material, see Fig. 5 in
Seemüller et al. (2013). Computational investigations suggest that, even 
for the well-aligned material, substantial softening only occurs for direct
loading in fiber direction, see Sec. 4.6.3. For a thorough investigation of
this load case and a physical interpretation of the softening, we refer to
the studies of Albiez et al. (2016a; 2019). As the present study focuses on
cellular NiAl-Mo, we restrict to investigating the steady-state creep rate
of the materials, i.e., we treat $\tau^F$ as constant.
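The power-law flow rule (7.2) with constant yield stress can be evaluated directly for a given stress state. The sketch below assumes that the slip-system dyad is symmetrized and uses a single hypothetical slip system with made-up parameter values; it is not the constitutive routine of the solver used in this work.

```python
import numpy as np


def plastic_strain_rate(sigma, slip_systems, tau_f, gamma0, m):
    """Evaluate the power-law flow rule for a list of (d, n) slip systems.

    Assumption: the plastic strain rate uses the symmetrized dyad of slip
    direction d and slip plane normal n.
    """
    eps_p_dot = np.zeros((3, 3))
    for d, n in slip_systems:
        d = np.asarray(d, dtype=float) / np.linalg.norm(d)
        n = np.asarray(n, dtype=float) / np.linalg.norm(n)
        schmid = 0.5 * (np.outer(d, n) + np.outer(n, d))
        tau = np.sum(sigma * schmid)          # resolved shear stress
        rate = gamma0 * np.sign(tau) * abs(tau / tau_f) ** m
        eps_p_dot += rate * schmid
    return eps_p_dot


# Hypothetical example: pure shear on one slip system, illustrative values only.
sigma = np.zeros((3, 3))
sigma[0, 1] = sigma[1, 0] = 2.0
rate = plastic_strain_rate(sigma, [((1, 0, 0), (0, 1, 0))],
                           tau_f=1.0, gamma0=1e-3, m=3)
```

Because every symmetrized dyad of orthogonal vectors is traceless, the resulting plastic strain rate is isochoric, as expected for dislocation glide.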


Table 7.1: Material parameters for NiAl and Mo at 900°C (Albiez et al., 2016a; Seemüller 
et al., 2013) 


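NiAl         $C_{11}$ = 184 GPa     $C_{12}$ = 121 GPa     $C_{44}$ = 88.1 GPa
             $\tau^F$ = 30.75 MPa   $\dot{\gamma}_0$ = 845x10-§ s$^{-1}$   $m$ = 5.8
Mo-Fibers    $C_{11}$ = 410 GPa     $C_{12}$ = 163 GPa     $C_{44}$ = 100 GPa
             $\tau^F$ = 3751 MPa    $\dot{\gamma}_0$ = $3.43 \times 10^{-1}$ s$^{-1}$   $m$ = 10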


The material parameters for NiAl and the Mo-fibers are mostly taken 
from Albiez et al. (2016a), see Tab. 7.1. By comparing the yield strength 
$\tau^F$ of the two materials at 900°C, the difference in creep resistance
becomes apparent. Due to the directional solidification process, the 
Mo-fibers are virtually free of dislocations (Bei et al., 2008; Sudharshan 
Phani et al., 2011), leading to a high yield strength of roughly 3% of 
the shear modulus of Mo. Based on an extensive literature review, 
Albiez et al. (2016a) were able to adopt most material parameters from 
existing sources. Indeed, among the relevant parameters for the present 
study, only the reference shear rate of the Mo-fibers was calibrated to 
match simulation results (Albiez et al., 2016a). However, as the study 
of Seemüller et al. (2013) represents our primary point of comparison, 
we adopt two additional changes with respect to the parameters of NiAl. 
More precisely, we choose m = 5.8 as measured by Seemüller et al.
(2013), compared to 4.04 used by Albiez et al. (2016a). Indeed, a large 
range of values from 3 to 7 has been reported for the stress exponent
of single-phase NiAl in the literature (Noebe et al., 1993). The large
scatter in experimental measurements may be due to the sensitivity of
the stress exponent of near-stoichiometric NiAl to composition, as noted
by Whittenberger (1987). To compensate for the change in the stress
exponent, the value of $\dot{\gamma}_0$ was modified to reach a decent agreement
between modeled material behavior and experimental measurements, 
see Fig. 7.2c. 


7.2.2 Minimum creep rate of well-aligned NiAl-Mo
under various loading angles

Figure 7.2: Microstructure and creep behavior of well-aligned NiAl-Mo

For characterizing the material behavior of well-aligned NiAl-Mo, we
use the two-dimensional cell shown in Fig. 7.2a, with 100 fibers occupy- 
ing 14% of the total area (Bei and George, 2005). Distinct microstructural 
features of well-aligned NiAl-Mo include the square cross-section of 
the Mo-fibers and their regular arrangement in a hexagonal pattern, 
see Fig. 1 in Bei and George (2005) or Fig. 3 in Seemüller et al. (2013).
To generate a similar hexagonal arrangement, we use the mechanical 
contraction algorithm of Williams and Philipse (2003) to generate a 
circle packing with 70% volume fraction. Subsequently, square fibers 
of appropriate size are placed at the centers of the packed circles. The 
resulting structure is discretized by 256 x 256 pixels. For investigating the 
anisotropic creep behavior of the material, we apply periodic boundary 
conditions and prescribe the effective stress tensor $\bar{\sigma}$, i.e., the volume
average of the stress field. More precisely, the prescribed effective stress
tensor has the form $\bar{\sigma} = \sigma \, d \otimes d$, corresponding to a uniaxial stress state
with magnitude $\sigma$ and loading direction $d$. The loading is applied in 1 s
and held until a steady-state strain-rate is reached. Different loading 
directions d are tested with respect to their angle of misalignment to the 
growth direction, see Fig. 7.2b for a sketch. Details on the computational 
setup of the FFT-based micromechanics solver are given in Sec. 7.4.1. 
The computed minimum creep rate of the well-aligned material for 
loadings in growth direction at various stress levels is compared to 
the creep experiments by Seemüller et al. (2013) in Fig. 7.2c. Although the
data from simulation and experiment are in decent agreement in the
range from 150 to 200 MPa, the slopes, i.e., the apparent stress exponents,
differ notably, with m = 10 in the simulations compared to values 
of 5 to 7 in Seemüller et al. (2013). This indicates that the Mo-fibers 
control the creep behavior of the single colony in the simulation. A 
broader review of existing creep studies reveals that there is, in fact, no 
clear consensus on the stress exponent of well-aligned NiAl-Mo. For 
instance, creep experiments by Haenschke et al. (2010), Albiez et al.
(2016a) and Dudová et al. (2011) displayed fiber-dominant behavior
with m between 10 and 14. In contrast, Seemüller et al. (2013) and Hu
et al. (2013) measure an exponent in the range of 4 to 7. Taking a closer
look at the anisotropic creep behavior predicted by the microstructure 
computations, see Fig. 7.3, elucidates the disparity in experimental 
measurements. Fig. 7.3a reveals the pronounced sensitivity of the creep 
behavior with respect to the angle of misalignment between loading 
and fiber direction. For loading angles larger than 5°, the creep rate 
quickly increases by orders of magnitude. Indeed, between 15° and
30°, the reinforcing effect of the fibers mostly vanishes and the creep
rate approaches that of the pure NiAl matrix. A more subtle change in
behavior can be observed at small angles of misalignment, see Fig. 7.3b.
Between 0° and 2°, we observe no change in creep behavior and the
apparent stress exponent corresponds to that of the Mo-fibers. However,
between 3° and 4°, there is a turning point from fiber-controlled to
matrix-controlled creep, with little change in the overall magnitudes of creep
rates (at least between 150 and 200 MPa). This offers a possible explanation
for the wide range of determined stress exponents in the aforementioned 
experimental studies, as a small misalignment with respect to the loading 
direction has a notable impact on the measured rates. In Sec. 7.4.3, we 
identify the mesostructure of the material as another plausible source 
for the scatter in stress exponents. 
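The apparent stress exponent discussed above is the slope of the minimum creep rate over the applied stress in a double-logarithmic plot. A minimal sketch of this evaluation is given below; the function name and the synthetic data are hypothetical and serve only to show the fitting step.

```python
import numpy as np


def apparent_stress_exponent(stresses, creep_rates):
    """Least-squares slope of log(creep rate) over log(stress)."""
    slope, _intercept = np.polyfit(np.log(stresses), np.log(creep_rates), 1)
    return slope


# Synthetic power-law data (illustrative values, not measurements).
stresses = np.array([150.0, 175.0, 200.0])
creep_rates = 1e-30 * stresses ** 10
m_apparent = apparent_stress_exponent(stresses, creep_rates)
```

Applied to simulated or measured minimum creep rates at several stress levels, the same fit yields the exponents compared in Fig. 7.2c and Fig. 7.3b.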

Overall, we conclude that the material model and parameters by Albiez 
et al. (2016a) lead to a good agreement of micromechanical simulations 
with experimental results, in particular when taking the sensitivity of the 
material behavior with respect to load angle into account. Indeed, the 
creep data by Seemüller et al. (2013) matches the computational results
for a loading angle of 4° almost perfectly, see Fig. 7.2c and Fig. 7.3b. 
Having validated the model and computations on the microscale, we 
use the obtained results to calibrate a surrogate model, mimicking the 
effective behavior of the well-aligned fibrous material.

Figure 7.3: Comparison of FFT-based simulations on well-aligned NiAl-Mo
microstructures and the surrogate model (7.8) for uniaxial creep tests where the loading angle is
given with respect to growth direction


7.2.3 Phenomenological model for the well-aligned fiber 
structure 


The objective of the section at hand is to develop a simple phenomenological
elasto-viscoplastic material model which is able to capture the creep
behavior observed in Sec. 7.2.2. In particular, the following properties
should be reflected by the model:

1. The transverse isotropy of both stiffness and flow rule, induced by
the microstructure.

2. The directional dependence of the apparent stress exponent, resulting
from the difference in fiber and matrix behavior.

The effective linear elastic behavior is governed by Hooke's law (7.1),
where the components of the effective stiffness tensor are readily
obtained by six linear elastic computations. The computed stiffness tensor
is almost transversely isotropic, with a relative error below 0.1%. The

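associated engineering constants are listed in Tab. 7.2. For the flow rule,
we rely on the transversely isotropic splitting of the deviatoric stress
tensor by Naumenko and Altenbach (2005)

$$\sigma' = \sigma'_L + \sigma'_P + \sigma'_S \tag{7.3}$$

into a longitudinal component $\sigma'_L$, the plane stress $\sigma'_P$ and the remaining
out-of-plane shear stress $\sigma'_S$, defined by

$$\sigma'_L = \left( \tfrac{3}{2} \, \sigma : (n \otimes n) - \tfrac{1}{2} \operatorname{tr}(\sigma) \right) \left( n \otimes n - \tfrac{1}{3} I \right), \tag{7.4}$$

$$\sigma'_P = (I - n \otimes n) \cdot \sigma \cdot (I - n \otimes n) - \tfrac{1}{2} \left( \operatorname{tr}(\sigma) - \sigma : (n \otimes n) \right) (I - n \otimes n), \tag{7.5}$$

$$\sigma'_S = 2 \left( n \cdot \sigma \cdot (I - n \otimes n) \right) \otimes_s n, \tag{7.6}$$

respectively. Here, $n$ denotes the unit normal of the isotropic plane,
i.e., the fiber direction. Naumenko and Altenbach (2005) show that
the Frobenius norms $\|\sigma'_L\|$, $\|\sigma'_P\|$ and $\|\sigma'_S\|$ of the stress components
constitute a set of independent, transversely isotropic invariants of $\sigma'$.
Thus, for any flow potential of the form $\Phi(\sigma') = \tilde{\Phi}(\|\sigma'_L\|, \|\sigma'_P\|, \|\sigma'_S\|)$,
the associated flow rule $\dot{\varepsilon}_p = \frac{\partial \Phi}{\partial \sigma'}(\sigma')$ is transversely isotropic. For the
present model, we use the simple ansatz

$$\Phi(\sigma') = \dot{\varepsilon}_0 \left( \frac{\sigma_L^F}{m_L + 1} \left| \frac{\|\sigma'_L\|}{\sigma_L^F} \right|^{m_L + 1} + \frac{\sigma_P^F}{m_P + 1} \left| \frac{\|\sigma'_P\|}{\sigma_P^F} \right|^{m_P + 1} + \frac{\sigma_S^F}{m_S + 1} \left| \frac{\|\sigma'_S\|}{\sigma_S^F} \right|^{m_S + 1} \right), \tag{7.7}$$

which leads to the flow rule

$$\dot{\varepsilon}_p(\sigma') = \dot{\varepsilon}_0 \left( \left| \frac{\|\sigma'_L\|}{\sigma_L^F} \right|^{m_L} \frac{\sigma'_L}{\|\sigma'_L\|} + \left| \frac{\|\sigma'_P\|}{\sigma_P^F} \right|^{m_P} \frac{\sigma'_P}{\|\sigma'_P\|} + \left| \frac{\|\sigma'_S\|}{\sigma_S^F} \right|^{m_S} \frac{\sigma'_S}{\|\sigma'_S\|} \right), \tag{7.8}$$

i.e., each stress component $\sigma'_L$, $\sigma'_P$, $\sigma'_S$ has an associated yield stress $\sigma_L^F$,
$\sigma_P^F$, $\sigma_S^F$ and stress exponent $m_L$, $m_P$, $m_S$, respectively. In contrast to
the equivalent-stress approach by Naumenko and Altenbach (2005), the
present formulation is able to accommodate different stress exponents
for longitudinal and in-plane loadings. On the downside, our flow rule
does not reduce to the classical J2-plasticity model for a specific choice
of parameters.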
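The stress splitting (7.4)-(7.6) and the flow rule (7.8) are straightforward to evaluate numerically. The following sketch is an illustration only: the function names are hypothetical, the reference rate is called `eps0` here, and the default parameters are the creep values as read from Tab. 7.2.

```python
import numpy as np

I3 = np.eye(3)


def split_deviator(sigma, n):
    """Split the stress into longitudinal, in-plane and out-of-plane shear parts."""
    n = np.asarray(n, dtype=float) / np.linalg.norm(n)
    N = np.outer(n, n)
    P = I3 - N                                   # projector onto the isotropic plane
    s_nn = n @ sigma @ n
    tr = np.trace(sigma)
    s_L = (1.5 * s_nn - 0.5 * tr) * (N - I3 / 3.0)
    s_P = P @ sigma @ P - 0.5 * (tr - s_nn) * P
    t = P @ sigma @ n                            # shear traction on the plane
    s_S = np.outer(t, n) + np.outer(n, t)        # equals 2 * sym(t (x) n)
    return s_L, s_P, s_S


def surrogate_strain_rate(sigma, n, eps0=0.01,
                          yield_stresses=(625.0, 153.5, 154.5),
                          exponents=(10.0, 5.8, 5.8)):
    """Evaluate the flow rule (7.8); stresses in MPa, rates in 1/s."""
    rate = np.zeros((3, 3))
    for s, s_f, m in zip(split_deviator(sigma, n), yield_stresses, exponents):
        norm = np.linalg.norm(s)
        if norm > 0.0:
            rate += eps0 * (norm / s_f) ** m * s / norm
    return rate


def uniaxial_creep_rate(stress, angle_deg, n=(0.0, 0.0, 1.0)):
    """Strain rate in loading direction for a uniaxial stress at a misalignment angle."""
    a = np.radians(angle_deg)
    d = np.array([np.sin(a), 0.0, np.cos(a)])
    return d @ surrogate_strain_rate(stress * np.outer(d, d), n) @ d


rate_aligned = uniaxial_creep_rate(200.0, 0.0)
rate_off_axis = uniaxial_creep_rate(200.0, 15.0)
```

Comparing `rate_aligned` and `rate_off_axis` reproduces, at least qualitatively, the strong loss of creep resistance under off-axis loading discussed in Sec. 7.2.2.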

The material parameters for the flow rule, see Tab. 7.2, were calibrated
by performing a creep test in fiber direction and two shear-creep tests.
The resulting creep behavior of the surrogate model is compared to
the crystal plasticity computations of Sec. 7.2.2 in Fig. 7.3. Overall, the
surrogate model matches the simulations exceptionally well. Both the
deterioration of creep resistance for off-angle loadings, see Fig. 7.3a, and
the transition from fiber- to matrix-dominated creep at small angles, see
Fig. 7.3b, are reproduced with high accuracy. Thus, the surrogate
model is suitable for facilitating computational investigations on cellular
NiAl-Mo on the mesoscale. However, some remarks on the limitations
of the model are in order:

1. The largest relative error in strain-rates between surrogate model and
micromechanical crystal plasticity simulation is around 25% for
loadings perpendicular to the fibers. This is acceptable for investigations
of the creep behavior, where creep rates are typically visualized on
a logarithmic scale and experimentally determined creep rates may
scatter up to an order of magnitude. However, in other contexts, e.g.,
for predicting the non-linear stress-strain behavior, the model may
have to be reviewed, or, at least, carefully recalibrated.

2. Both simulations (Albiez et al., 2016a; 2019) and experiments (Dudová
et al., 2011; Hu et al., 2013; Seemüller et al., 2013) on well-aligned
NiAl-Mo show a transient decrease of the creep rate in the initial
stages of a creep test, owing to the load transfer from fibers to matrix.
Naturally, the surrogate model cannot account for this behavior as
the constituent phases are not explicitly resolved.

Table 7.2: Material parameters for the surrogate model, mimicking the creep behavior of
unidirectional NiAl-Mo with 14% fiber fraction

Stiffness    $E_L$ = 120.6 GPa      $E_T$ = 181.9 GPa        $G_{LT}$ = 89.7 GPa
             $\nu_{TT}$ = 0.015     $\nu_{LT}$ = 0.379
Creep        $\sigma_L^F$ = 625 MPa   $\sigma_P^F$ = 153.5 MPa   $\sigma_S^F$ = 154.5 MPa
             $m_L$ = 10             $m_P$ = 5.8              $m_S$ = 5.8
             $\dot{\varepsilon}_0$ = 0.01 s$^{-1}$


7.3 Generating synthetic cellular mesostructures


Figure 7.4: Different stages of the microstructure generation process with the underlying
fiber structure (left), the Voronoi level set (7.9) of the center lines (middle) and the final cell
structure (right); panels: (a) transverse cross section, (b) 3D overview


For NiAl-10Mo alloys solidified at a rate of 80 mm/h, Seemüller et al.
(2013) observed that regions of well-aligned unidirectional fibers formed
cellular structures on the meso-scale, surrounded by misaligned fibers
and pure matrix material. The cells, featuring roughly hexagonal
cross-sections, were elongated in the direction of solidification, with lengths
of around 1000 µm and an aspect ratio of five. Based on a cell distance
between 6 and 10 µm, Seemüller et al. (2013) estimated a volume fraction
of 82 to 85% of the hard regions.

For generating synthetic volume elements, mimicking the aforemen- 
tioned characteristics, we rely on the level-set-based framework of Sonon 
et al. (Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015). In the following, 
the basic methodology is briefly summarized for the convenience of 
the reader. Suppose we have a rectangular cell $Y$ in $\mathbb{R}^d$ with a set of
non-overlapping particles $\Phi = \bigcup_i \Phi_i$. Sonon et al. propose an implicit
description of the microstructure in terms of the nearest neighbour level
set

$$DN_1(x) = \begin{cases} \min_{y \in \partial\Phi} d(x, y), & x \notin \Phi, \\ -\min_{y \in \partial\Phi} d(x, y), & x \in \Phi, \end{cases}$$

where $d(x, y)$ denotes the periodic distance of two points $x, y \in Y$ and
$\partial\Phi$ stands for the boundary of the set $\Phi$. Thus, the condition $DN_1(x) < 0$
describes the space occupied by particles. As an extension to $DN_1(x)$,
the level sets $DN_k(x)$ may be computed (Sonon et al., 2012), encoding the
periodic distance at each point to the $k$-th nearest particle $\Phi_i$. The $DN_k(x)$
level sets may be used in the context of dense packing algorithms and/or
for generating new microstructures by thresholding suitable level-set
functions (Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015)

$$f(DN_1(x), DN_2(x), \ldots, DN_k(x)) < 0.$$

In particular, for the present study, we exploit the Voronoi-type level set
with interparticle distance $t$

$$DN_1(x) - DN_2(x) + t < 0, \tag{7.9}$$

for generating microstructures with the complex geometrical features of
cellular NiAl-Mo. For a given collection of particles, $DN_1(x) - DN_2(x) = 0$
describes the boundary of the associated Voronoi tessellation. Thus, the
geometry extracted by the related level set (7.9) may be interpreted as an 
expansion of all particles to a shape which enforces a uniform distance 
of t between the resulting cells, see Sec. 4.1 in Sonon et al. (2012). In 
two dimensions, Massart et al. (2018) used the level set (7.9) to generate 
irregular masonry structures featuring elongated inclusions, resembling 
the cells observed in NiAl-10Mo. We follow a similar approach to 
generate the microstructures for the present study: 


1. For given cell dimensions, we use the sequential addition and migra- 
tion (SAM) algorithm (Schneider, 2017b) to pack cylindrical fibers 
with a length of 800 µm and a diameter of 160 µm until a volume
fraction of at least 45% is reached. The SAM method has proven to 
be a flexible and powerful scheme for generating dense packings 
of non-overlapping short-fibers with arbitrary prescribed orienta- 
tion state and thus represents our algorithm of choice. However, 
for the simple uni-directional case, the method may be substituted 
by any algorithm which is capable of reaching the desired volume 
fraction. For instance, the LS-RSA method by Sonon et al. (2012) 
may be adapted for elongated inclusions to integrate the level-set 
computation in the packing algorithm. As the level-set operation (7.9) 
further enlarges the inclusions, the fiber dimensions were chosen 20% 
smaller than those observed for the cells in the alloy. The volume 
fraction was chosen to obtain a dense fiber packing, i.e., a roughly 
hexagonal pattern, which still permits some irregularity as observed 
in the actual microstructure. 

2. The level sets $DN_1(x)$ and $DN_2(x)$ are computed based on the center
lines of the fibers. For efficiently computing the level sets, we rely 
on the Euclidean distance transform by Meijster et al. (2002), see 
Appendix C for further details.

3. Using the bisection method, we iteratively solve for the cell distance
$t$ until a prescribed cell volume fraction $\phi$ is obtained. With the
indicator function

$$i_t(x) = \begin{cases} 1, & DN_1(x) - DN_2(x) + t < 0, \\ 0, & \text{otherwise}, \end{cases}$$


of the level set (7.9), we terminate when the convergence criterion 


$$\left| \frac{1}{|Y|} \int_Y i_t(x) \, \mathrm{d}V - \phi \right| \le \delta$$


is satisfied. Throughout we set the tolerance for the volume fraction to 
ö = 107°. Unless stated otherwise, the prescribed volume fraction is 
set to $\phi$ = 85%, following the estimate of Seemüller et al. (2013). Note
that we prefer to fix the volume fraction $\phi$ rather than the interparticle
distance t, as, from the viewpoint of micromechanics, the volume 
fraction enters the effective (linear elastic) material behavior to first 
order (Milton, 2002, Ch. 14). 
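Steps 2 and 3 above can be sketched in a simplified setting. The code below makes several assumptions that differ from the actual workflow: it uses point seeds instead of fiber center lines, a two-dimensional periodic cell, and brute-force distance evaluation instead of the distance transform of Meijster et al. (2002). All function names and the tolerance value are hypothetical.

```python
import numpy as np


def periodic_distances(points, seeds, box):
    """Periodic Euclidean distance from every grid point to every seed."""
    diff = points[:, None, :] - seeds[None, :, :]
    diff -= box * np.round(diff / box)           # minimum-image convention
    return np.linalg.norm(diff, axis=2)


def voronoi_level_set_cells(seeds, box, shape, target_fraction,
                            tol=1e-3, max_iter=60):
    """Bisect the cell distance t until the cell volume fraction is met."""
    box = np.asarray(box, dtype=float)
    axes = [(np.arange(n) + 0.5) * l / n for n, l in zip(shape, box)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    grid = grid.reshape(-1, len(shape))
    dist = np.sort(periodic_distances(grid, seeds, box), axis=1)
    dn1, dn2 = dist[:, 0], dist[:, 1]            # nearest and second-nearest seed
    lo, hi = 0.0, float(np.max(dn2 - dn1))
    for _ in range(max_iter):
        t = 0.5 * (lo + hi)
        fraction = np.mean(dn1 - dn2 + t < 0.0)  # discrete volume fraction
        if abs(fraction - target_fraction) <= tol:
            break
        if fraction > target_fraction:
            lo = t                               # cells too large: widen the gap
        else:
            hi = t
    indicator = (dn1 - dn2 + t < 0.0).reshape(shape)
    return indicator, t, fraction


rng = np.random.default_rng(0)
seeds = rng.random((20, 2))                      # hypothetical seed positions
cells, t_cell, phi_cell = voronoi_level_set_cells(
    seeds, box=(1.0, 1.0), shape=(128, 128), target_fraction=0.85)
```

Since the indicator is evaluated on the same grid as the target resolution, the volume fraction is controlled on the discrete level, which is the motivation given below for repeating steps 2 and 3 for each resolution.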


Note that, in practice, the discrete level set is computed on a regular 
background grid. Throughout the present study, we choose the same 
refinement for the level-set computation as for the target resolution of the 
microstructure used in the FFT-based computations. More precisely, for a 
given underlying fiber packing, steps 2 and 3 of the outlined process are 
repeated for each realized resolution. Compared to downsampling all 
realizations from a single finely resolved microstructure, this approach 
requires a larger number of level-set computations. However, it offers 
tighter control of the target volume fraction, which is preferred with 
respect to the minimum necessary resolution for the FFT-based computations, 
see Sec. 7.4.2. The processing steps for a generated microstructure 
with dimensions 4000 µm × 800 µm × 800 µm are visualized in Fig. 7.4. 
Due to the dense fiber packing, the placement and aspect ratio of the cells 
closely follow those of the underlying fibers. Note that smaller fragments 
visible in Fig. 7.4 arise as artifacts of the 2D cuts and are actually part of 
regularly sized cells. A transverse section and a longitudinal section of 
the generated structure are compared to dark field optical microscopy 
images of cellular samples by Seemüller et al. (2013) in Fig. 7.5. Both 
the roughly hexagonal cross section of the cells and their elongated 
shape with an aspect ratio of about five are featured in the synthetic 
structure. Hence, the volume elements generated by the adapted level-set 
strategy closely resemble cellular NiAl-Mo, enabling subsequent 
micromechanical studies on the material's effective creep behavior. 


7.4 Creep behavior of cellular multi-colony NiAl-Mo eutectics with degenerated boundary regions 


7.4.1 Computational setup 


For computing the effective creep response of the NiAl-Mo alloys, we 
rely on an in-house FFT-based micromechanics solver, written in Python 
3.7 with Cython extensions and parallelized using OpenMP. More precisely, 
we use the BFGS-CG algorithm, see Sec. 3.3.4, in combination 
with the staggered grid discretization (Schneider et al., 2016). We refer 
to the recent review by Schneider (2021) for a general overview of 
current FFT-based methods and the articles by Segurado et al. (2018) and 
Lebensohn and Rollett (2020) for dedicated reviews on the computational 
homogenization of polycrystalline materials. For a detailed discussion 
of the specific algorithms used in the study at hand, see Ch. 3. 


Figure 7.5: Synthetic microstructures in comparison to dark field optical microscopy 
images of NiAl-Mo by Seemüller et al. (2013).⁸ Panels: (a) transverse section of 
synthetic structure; (b) microscopy image of transverse section; (c) longitudinal 
section of synthetic structure; (d) microscopy image of longitudinal section. 

8 Fig. 7.5b and Fig. 7.5d from Seemüller et al. (2013) are reused under the STM 
permissions guidelines: https://www.stm-assoc.org/intellectual-property/permissions/permissions-guidelines/. 


FFT-based solvers naturally operate with periodic boundary conditions, 
i.e., the stress and strain fields in the volume element are periodic. For 
our investigations, we prescribe an effective stress σ̄, which is the volume 
average of the stress field, of the form σ̄ = σ d ⊗ d, corresponding to 
a uniaxial stress state with magnitude σ in direction d, see Kabel et al. 
(2016). The loading is applied within 1 s and subsequently held constant until 
a steady-state creep rate is reached. For our investigation of the cellular 
material, we restrict ourselves to loadings in growth direction. 
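
For illustration, the prescribed uniaxial stress tensor may be assembled as the scaled dyadic product of the loading direction with itself. The snippet below is a minimal sketch; the function name is hypothetical, and the assumption that the growth direction coincides with the first coordinate axis is made for the example only.

```python
import math


def uniaxial_stress(sigma, direction):
    """Return sigma * (d ⊗ d) as a 3x3 nested list for the normalized direction d."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]
    return [[sigma * d[i] * d[j] for j in range(3)] for i in range(3)]


# Assumed for this example: growth direction along the first axis, 200 MPa loading.
stress = uniaxial_stress(200.0, (1.0, 0.0, 0.0))
for row in stress:
    print(row)
```

For any unit direction d, the trace of the resulting tensor equals the stress magnitude σ, which provides a simple consistency check.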

Throughout, convergence of the FFT-based solver is checked using the 
criterion proposed in Sec. 5 by Schneider et al. (2019) with a prescribed 
tolerance of 10~*. For the soft regions in the cell-boundary regions, we 
use the material model of NiAl, see Sec. 7.2.1. The behavior of the hard 
regions in the well-aligned cells is governed by the surrogate model 
proposed in Sec. 7.2.3. All computations were either performed on a 
workstation with two 12-core Intel Xeon(R) Gold 6146 CPUs and 512 GB 
RAM or a workstation with two AMD EPYC 7642 with 48 cores each 
and 1024 GB RAM. 


7.4.2 Study on the size of the volume element 


FFT-based micromechanics solvers naturally operate on a regular 
(voxel) grid. However, even when treating the hard regions in cellular 
NiAl-Mo as a homogeneous material, the difference between the largest 
geometric features, i.e., cell lengths of about 1000 µm, and the smallest 
geometric features, i.e., the soft cell boundaries with a thickness around 
10 µm, is still very large. Both memory and runtime limit the size 
of volume elements which are feasible for computation. Thus, it is 
imperative to identify both a suitable volume element size and an 
appropriate resolution, while keeping the possible error of the material 
response reasonably small (Gote et al., 2022). 

In this context, it is useful to recall some insights from the study on 
representative volume elements by Kanit et al. (2003). When computing 
an effective material property based on a randomly generated microstructure 
of finite size, Kanit et al. (2003) identify two sources of error. 
For an ensemble of finite microstructure realizations of the same size, 
there will be some scatter in the effective properties of each realization. 
The difference between the effective property of a single realization 
and the mean of an infinitely large ensemble is called dispersion or 
random error. The dispersion can either be reduced by increasing the 
size of the microstructure or by averaging over multiple microstructures. 
The second error source is the bias or systematic error, describing the 
difference between the mean effective properties for a finite volume 
element size and the effective properties of the infinite volume limit. For 
instance, choosing a small volume element may induce anomalies in 
the microstructure leading to incorrect effective properties, independent 
of the number of realizations considered. Indeed, the systematic error 
can only be reduced by increasing the size of the microstructure. As the 
size of the volume element is a limiting factor for the simulations, we 
aim to identify the smallest microstructure which sufficiently reduces 
the systematic error and keep track of the dispersion by considering 
multiple realizations. 

In the following, we investigate microstructures with varying lengths 
L and cross-section widths W. For each size, ten volume elements 
are generated and the effective creep rates for a uniaxial stress loading 
of 200 MPa in growth direction are computed. Based on preliminary 
investigations, the voxel size is fixed at 8 µm, unless stated otherwise. In 
Fig. 7.6, we plot the resulting mean values together with the two-sided 
99% confidence interval based on Student's t-distribution, following 
Schneider et al. (2022). Note that for better readability, we use a linear 
scale on the y-axis instead of the typical logarithmic scale when plotting 
experimental creep rates. 
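
The two-sided confidence interval of the ensemble mean may be evaluated as sketched below. This is a minimal example, not the evaluation script of this study: the function name is hypothetical, the quantiles are standard table values of Student's t-distribution hard-coded for a few ensemble sizes, and the creep rates are made-up numbers for illustration.

```python
import math
import statistics

# Two-sided 99% quantiles t_{0.995, n-1} of Student's t-distribution,
# keyed by the degrees of freedom n - 1 (standard table values).
T_QUANTILE_99 = {4: 4.6041, 9: 3.2498, 19: 2.8609}


def confidence_interval_99(samples):
    """Return the sample mean and the half-width of its two-sided 99% confidence interval."""
    n = len(samples)
    t = T_QUANTILE_99[n - 1]
    mean = statistics.mean(samples)
    half_width = t * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width


# Made-up creep rates (in 1/s) of ten hypothetical realizations, for illustration only.
rates = [1.9e-8, 2.3e-8, 2.1e-8, 1.8e-8, 2.4e-8, 2.0e-8, 2.2e-8, 2.6e-8, 1.7e-8, 2.0e-8]
mean_rate, half_width = confidence_interval_99(rates)
print(f"mean = {mean_rate:.3e} 1/s, 99% interval = +/- {half_width:.3e} 1/s")
```

Because the half-width scales with the inverse square root of the number of realizations, halving the interval for a fixed volume element size requires roughly four times as many realizations.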

First, we take a look at microstructures of varying width for a fixed length 
of L = 2000 µm. We observe that up to a width of 800 µm the averaged 
creep rate increases linearly and subsequently stagnates, see Fig. 7.6a. 
Indeed, the creep rate for W = 400 µm is 30% below the stationary level, 
indicating a large bias. Between W = 800 µm and W = 1200 µm, the 
fluctuation of the mean creep rates is small compared to the confidence 
intervals, revealing that the dispersion is the primary error source. As 
expected, the confidence intervals narrow down with increasing size. 
However, when considering an ensemble of ten volume elements, a 
width of W = 800 µm appears sufficient. 

Figure 7.6: Influence of cell size and resolution on the effective creep rate with default 
values of L = 2000 µm, W = 800 µm and a default voxel size of 8 µm 

Qualitatively, the same trends emerge for volume elements of varying 
length, see Fig. 7.6b. Using microstructures with L = 1000 µm, i.e., a 
single cell length, leads to a systematic underestimation of the creep 
rate by about 70%. Notably, owing to the imposed regularity of the 
structure (each cell borders itself in length direction), the dispersion is 
comparatively small for this case, demonstrating that bias and dispersion 
do not always follow the same trends. For volume elements longer than 
2000 µm, there are only marginal changes in the average creep rates. 
Overall, we conclude that a length of 2000 µm, i.e., two cell lengths, is 
sufficient for our purposes, arriving at a default volume element size 
of 2000 µm × 800 µm × 800 µm for our subsequent investigations. We 
emphasize that this choice is only safe if an ensemble of (at least) 10 
microstructures is considered. As the dispersion is still rather high, with a 
relative sample standard deviation of 11.5%, using only a single volume 
element may lead to significant (and undetectable) errors (Schneider 
et al., 2022). Further note that these results only hold for investigating 
the effective creep rate. When studying other physical properties, the 
representative volume size has to be identified anew. 

Last but not least, we validate our chosen resolution for our final volume 
element size of L = 2000 µm and W = 800 µm. To this end, the full ensemble 
of 10 microstructures was discretized with voxel lengths ranging 
from 2 µm to 8 µm. In comparison to the size of the volume element, the 
impact of the resolution is minuscule, see Fig. 7.6c. Note that a resolution 
of 8 µm is rather coarse, i.e., the soft cell boundary in the discretized 
microstructure is only one to two voxels in thickness. Hence, the low 
impact of resolution on the overall accuracy may appear surprising. We 
found that a key factor for the consistency of the results with respect 
to resolution lies in the microstructure generation process, see 
Sec. 7.3. For each sampled resolution, the target volume fraction of 
φ = 85% was reached to high accuracy by iteratively thresholding the 
underlying level set. Downsampling from a high-resolution microstructure, 
for instance, by using the median value, produces larger scatter in both 
volume fraction and creep rate. Overall, continuing the investigation 
with a default resolution of 8 µm per voxel seems reasonable. 
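
To put the considered resolutions into perspective, the following short sketch counts the voxels of the default volume element for the investigated voxel lengths. The helper name is hypothetical, and the boundary thickness of 10 µm is the approximate value quoted in Sec. 7.4.2.

```python
def voxel_counts(length_um, width_um, voxel_um):
    """Number of voxels per direction for a volume element with square cross section."""
    nx = round(length_um / voxel_um)
    ny = round(width_um / voxel_um)
    return nx, ny, ny


for voxel_um in (8, 4, 2):
    nx, ny, nz = voxel_counts(2000, 800, voxel_um)
    boundary_voxels = 10 / voxel_um  # approximate boundary thickness of 10 µm in voxels
    print(f"{voxel_um} µm: {nx} x {ny} x {nz} = {nx * ny * nz:,} voxels, "
          f"boundary about {boundary_voxels:.2f} voxels thick")
```

Refining the voxel length from 8 µm to 2 µm increases the voxel count by a factor of 64, from 2.5 million to 160 million voxels per realization, which illustrates why a coarse default resolution is attractive for ensemble computations.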


7.4.3 On the definition of the soft cell boundary 


In their experimental study on cellular NiAl-Mo, Seemüller et al. (2013) 
observed a massive loss of creep resistance compared to the well-aligned 
material. More precisely, for a certain nominal stress, the strain rate 
differed by about two to three orders of magnitude. The magnitude 
of this difference was unexpected, as fiber-free boundary regions only 
accounted for ≈ 15% of the total volume and grain boundaries were 
generally found to have no effect on the creep resistance of binary NiAl 
(Whittenberger, 1987). Hence, we are interested in computationally 
investigating the loss of creep resistance in the cellular material and 
comparing our results to the experimental data by Seemüller et al. (2013). 
In this context, we note that the definition of the soft regions and the 
volume fraction of the remaining well-aligned material are crucial. 

Figure 7.7: (a) Optical microscopy image by Seemüller et al. (2013) and (b) SEM image of 
boundary region by Haenschke et al. (2010).⁹ Panels: (a) fiber-free (violet, t ≈ 6 µm to 10 µm) 
and degenerated (pink, t ≈ 30 µm to 40 µm) cell boundary regions; (b) angle of fiber 
misalignment at cell boundary. 

For their estimated cell fraction of 82%-85%, Seemüller et al. (2013) 
only classified completely fiber-free regions as soft regions, see the 
violet shading in Fig. 7.7a. However, larger regions with a coarse 
fiber distribution and pronounced fiber misalignment can be identified 
around the cell boundaries (pink shading in Fig. 7.7a). In light of the 
results in Sec. 7.2.1, it is plausible that the degenerated regions do 
not significantly contribute to the creep resistance in growth direction. 
Indeed, scanning electron microscopy (SEM) images by Haenschke et al. 
(2010) reveal fiber misalignments between 20° and 30° at cell boundaries, 
see Fig. 7.7b. At these angles of misalignment, a single colony of 
well-aligned NiAl-Mo displays essentially the same creep behavior as 
the pure NiAl matrix. Thus, it appears reasonable to classify both the 
fiber-free and the degenerated regions as soft regions. To check this 
assertion, we consider the simulated creep behavior for varying volume 
fractions of the hard phase, see Fig. 7.8a to Fig. 7.8d for an example 
of a microstructure with varying cell distance t and volume fraction 
φ. Comparing the computed creep rates to the data by Seemüller et al. 
(2013) reveals that the experimentally determined creep rates lie between 
the simulation results for volume fractions of φ = 55% and φ = 65%, see 
Fig. 7.9. The cell distance of 28 µm to 38 µm for the associated synthetic 
structures roughly matches the thickness of 30 µm to 40 µm for the coarse 
region in the microscopy image by Seemüller et al. (2013). Thus, the creep 
simulations strengthen the hypothesis that both coarse and fiber-free 
regions should be classified as soft regions. 

Figure 7.8: Artificial microstructures with varying volume fraction φ and corresponding 
boundary width t. Panels: (a) φ = 85%, t = 11 µm; (b) φ = 75%, t = 19 µm; 
(c) φ = 65%, t = 28 µm; (d) φ = 55%, t = 38 µm. 

9 Fig. 7.7a from Seemüller et al. (2013) is reused under the STM permissions 
guidelines: https://www.stm-assoc.org/intellectual-property/permissions/permissions-guidelines/. 
The shading of the boundary regions and the associated annotations have been added. 
Fig. 7.7b from Haenschke et al. (2010) is reused under the CC BY 4.0 license: 
https://creativecommons.org/licenses/by/4.0/legalcode. The visualization for the angle 
of misalignment has been added. 

Our results highlight that, in contrast to binary NiAl (Whittenberger, 
1987), the boundary of the cellular colonies is essential for explaining the 
overall creep behavior of the cellular material. Owing to the much lower 
creep resistance of the boundary, properly defining the soft regions and 
their volume fraction is key for reaching accurate predictions. In particular, 
identifying the coarse regions with degenerated fiber structure as part of the 
soft regions sheds light on the deterioration of the creep resistance in cellular 
NiAl-Mo samples. Compared to the completely fiber-free regions, the 
degenerated part of the cell boundary occupies two to three times as 
much volume. Hence, the fraction of the actual hard regions is much 
lower than the 85% estimated by Seemüller et al. (2013), leading to the 
pronounced loss of creep resistance. An illustration of the difference 
in reinforcing structure is shown in Fig. 7.10, where the difference in 
synthetic volume elements with φ = 85% and φ = 55%, i.e., the impact 
of the coarse boundary, is visualized. Thus, it appears mandatory to 
pay special attention to such mesoscale deviations from the ideal fiber 
morphology when comparing the magnitudes of creep resistance of 
NiAl-based composites from different experimental datasets. As many 
alloys in the NiAl-(Cr,Mo) system exhibit similar colony structures with 
degenerated regions at the cell boundaries (Gombola et al., 2020), these 
findings should be taken into account when modeling and evaluating 
the creep resistance. In particular, more complex alloys with a larger 
number of constituents will be even more prone to form degenerated 
regions due to extended solidification intervals. 

Figure 7.9: Norton plot for various volume fractions and comparison to experimental 
data by Seemüller et al. (2013) 

Figure 7.10: Comparison of reinforcing cell structure between synthetic volume 
elements with φ = 85% and φ = 55%, difference shaded in pink 


7.4.4 Influence of the morphology on the creep response 


In terms of the mechanical properties of directionally solidified NiAl- 
Mo, aiming for a well-aligned microstructure appears to be optimal. 
However, this degree of fiber alignment is only achieved under specific 
processing conditions, i.e., slow growth rates and high temperature 
gradients, which are typically restricted to a laboratory environment (Bei 
and George, 2005; Bogner et al., 2012; Hu et al., 2012). In contrast, sam- 
ples solidified in industrial scale furnaces are prone to microstructural 
irregularities (Bogner et al., 2012). Hence, for the practical application of 
NiAl-Mo on a component scale, a robust prediction of the creep behavior 
in terms of the microstructure morphology is required to find a suitable 
compromise between mechanical behavior and favorable processing 
conditions. However, in practice, deliberate morphology modification 
of NiAl-Mo is limited due to strongly interrelated solidification and 
processing parameters. Thus, thoroughly characterizing the impact of 
the morphology on the creep behavior solely based on experiments is 
difficult. 

The level-set framework outlined in Sec. 7.3 provides greater flexibility 
for adjusting the aspect ratio and volume fractions of the generated 
synthetic microstructures. Hence, we expand upon the computations of 
Sec. 7.4.3 and investigate the impact of these morphological quantities 
on the effective creep rate. In addition, we compare our results with 
the Kelly-Street model (Kelly and Street, 1972b), which is popular for 
predicting the microstructure-dependent creep behavior of cellular and 
fibrous composites and evaluating experimental data (Seemüller et al., 
2013; Hu et al., 2013), to assess its accuracy for the case of NiAl-Mo. 
To this end, we generate microstructures with volume fractions from 
55% to 85% and aspect ratios from 5 to 40. Aspect ratios higher than the 
original ratio of 5 were realized by increasing the length of the cellular 
inclusions and the overall volume elements as part of the microstructure 
generation routine. Based on the results for l/d = 5, see Sec. 7.4.2, we set 
the width of all volume elements to four times the width of the cellular 
inclusions and the length to twice the cell length. 

Figure 7.11: Influence of volume fraction φ and aspect ratio l/d on the creep rate of cellular 
NiAl-Mo for a fixed stress loading of σ = 100 MPa 


The minimum creep rates obtained from simulations on the generated 
microstructures with a fixed stress loading of σ = 100 MPa are shown in 
Fig. 7.11. Recall that, according to Kanit et al. (2003), the dispersion of 
the effective properties is a decent measure for the representativeness 
of the volume element size. For the cellular microstructures considered 
in this study, this was confirmed in Sec. 7.4.2 for volume elements of 
sufficient length. As a general trend, the dispersion in the effective creep 
rates decreases with increasing aspect ratio, see Fig. 7.11. Hence, it is 
reasonable to assume that the size of the volume elements with an aspect 
ratio beyond the initial choice of l/d = 5 is sufficiently representative as 
well. In Fig. 7.11a, we observe that, for a fixed aspect ratio, a decrease 
in volume fraction by 20% leads to an increase in creep rate by roughly 
an order of magnitude. This trend is independent of the specific aspect 
ratio, as all plots in Fig. 7.11a feature a similar slope. Similarly, for a 
fixed volume fraction, all plots in Fig. 7.11b exhibit approximately the 
same general tendency. The creep rate decreases by a factor of about 
three from l/d = 5 to l/d = 10. For each subsequent doubling of l/d, the 
impact of the aspect ratio diminishes. The influence of volume fraction 
φ and aspect ratio l/d on the apparent stress exponent is illustrated in 
Fig. 7.12. In particular, all plots in Fig. 7.12a feature the same slope, 
revealing that the apparent stress exponent is virtually independent 
of φ. In contrast, increasing the aspect ratio leads to a marked change 
from matrix-controlled creep with m ≈ 6 for l/d = 5 to fiber-controlled 
creep with m ≈ 9 for l/d = 40, see Fig. 7.12b. Due to the change in the 
apparent stress exponent, the impact of the aspect ratio on the creep rate 
diminishes further at higher stresses. Note that the observed values for 
m are inside the range reported in the experimental literature (Dudova 
et al., 2011; Hu et al., 2013; Seemüller et al., 2013; Albiez et al., 2016a). 
Hence, the morphology of the colonies arises as another possible source 
for the scatter in experimental data, again emphasizing that information 
on the mesostructural properties is crucial for a proper assessment of 
data from different sources. 

Figure 7.12: Influence of volume fraction φ and aspect ratio l/d on the apparent stress 
exponent of cellular NiAl-Mo 

Figure 7.13: Comparison of the quasi-rigid Kelly-Street model (Kelly and Street, 1972b) to 
simulation results for σ = 100 MPa 
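
The apparent stress exponent corresponds to the slope of the creep rate over the applied stress in a double-logarithmic (Norton) plot. A minimal sketch of such an evaluation by a least-squares fit is given below; the function name is hypothetical and the data points are synthetic values following an exact power law with exponent 6, chosen for illustration only.

```python
import math


def apparent_stress_exponent(stresses, creep_rates):
    """Least-squares slope of log(creep rate) over log(stress)."""
    xs = [math.log(s) for s in stresses]
    ys = [math.log(r) for r in creep_rates]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Synthetic data: exact power law with exponent 6 (not simulation results).
stresses = [100.0, 150.0, 200.0]
rates = [1e-20 * s ** 6 for s in stresses]
m_apparent = apparent_stress_exponent(stresses, rates)
print(f"apparent stress exponent = {m_apparent:.3f}")
```

For data that do not follow a single power law, the fitted slope depends on the chosen stress range, which is why the exponent is referred to as apparent.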


Lastly, we turn to the comparison of the simulations to the 1-dimensional 
shear-lag model by Kelly and Street (1972b), widely used in materials 
science to assess and interpret experimental creep data of composites 
(Chan, 2002; Hu et al., 2013; Seemüller et al., 2013). In particular, the 
Kelly-Street model for quasi-rigid inclusions admits a closed-form 
expression for the creep rate of the composite as a function of the applied 
stress. Assuming a power-law formulation 

$$
\dot{\varepsilon} = \dot{\varepsilon}_0^{\mathrm{matrix}} \left( \frac{\sigma}{\sigma_0^{\mathrm{matrix}}} \right)^{m} \tag{7.10}
$$

for the matrix, the creep rate for the composite reads 

$$
\dot{\varepsilon} = \dot{\varepsilon}_0^{\mathrm{matrix}} \left[ \frac{\sigma}{\sigma_0^{\mathrm{matrix}} \left( \xi \, (l/d)^{(m+1)/m} \, \phi + (1 - \phi) \right)} \right]^{m} \tag{7.11}
$$

with the stress transfer function 

$$
\xi = \left( \frac{2}{3} \right)^{1/m} \frac{m}{2m+1} \left[ \left( \frac{\pi}{2\sqrt{3}\,\phi} \right)^{1/2} - 1 \right]^{-1/m}, \tag{7.12}
$$

see Sec. 3.1 in Kelly and Street (1972b) and the modifications by Chan 
(2002). Note that, in this context, all considered quantities, such as stress 
σ and strain rate ε̇, are scalar valued. In addition to the Kelly-Street 
model, we consider the rule of mixtures as a lower bound on the creep 
rate 

$$
\sigma = (1 - \phi)\, \sigma_0^{\mathrm{matrix}} \left( \frac{\dot{\varepsilon}}{\dot{\varepsilon}_0^{\mathrm{matrix}}} \right)^{1/m} + \phi\, \sigma_0^{\mathrm{fiber}} \left( \frac{\dot{\varepsilon}}{\dot{\varepsilon}_0^{\mathrm{fiber}}} \right)^{1/n}, \tag{7.13}
$$

where it is assumed that the fibers are governed by a power law, analogously 
to (7.10). Note that the rule of mixtures admits no closed-form 
solution for the strain rate ε̇ and has to be solved numerically for given 
stress σ. The material parameters of matrix and well-aligned colonies 
for the analytical models are listed in Tab. 7.3. 


Table 7.3: Parameters for the 1-dimensional power-law model 


Soft phase    ε̇₀^matrix = 1/s    σ₀^matrix = 503 MPa    m = 5.8 
Hard phase    ε̇₀^fiber = 1/s     σ₀^fiber = 1245 MPa    n = 10 
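
Both analytical models are straightforward to evaluate numerically. The snippet below is a minimal sketch and not the code used for Fig. 7.13. It assumes the commonly quoted form of the quasi-rigid Kelly-Street model, i.e., a composite creep rate of ε̇₀ [σ / (σ₀ (ξ (l/d)^((m+1)/m) φ + 1 − φ))]^m with the stress transfer function ξ = (2/3)^(1/m) m/(2m+1) [√(π/(2√3 φ)) − 1]^(−1/m), and it solves the rule of mixtures by bisection in logarithmic space. The function names and the bisection bracket are hypothetical choices; the parameters are those of Tab. 7.3 with a reference strain rate of 1/s for both phases.

```python
import math

# Parameters as listed in Tab. 7.3 (reference strain rate 1/s for both phases).
EPS0_MATRIX, SIGMA0_MATRIX, M = 1.0, 503.0, 5.8
EPS0_FIBER, SIGMA0_FIBER, N = 1.0, 1245.0, 10.0


def kelly_street_rate(sigma, phi, aspect_ratio):
    """Creep rate of the quasi-rigid Kelly-Street model in the form assumed above."""
    packing = math.sqrt(math.pi / (2.0 * math.sqrt(3.0) * phi)) - 1.0
    if packing <= 0.0:
        raise ValueError("model degenerates for phi >= pi / (2 sqrt(3))")
    xi = (2.0 / 3.0) ** (1.0 / M) * M / (2.0 * M + 1.0) * packing ** (-1.0 / M)
    factor = xi * aspect_ratio ** ((M + 1.0) / M) * phi + (1.0 - phi)
    return EPS0_MATRIX * (sigma / (SIGMA0_MATRIX * factor)) ** M


def rule_of_mixtures_stress(rate, phi):
    """Composite stress of the rule of mixtures for a common strain rate in both phases."""
    matrix = (1.0 - phi) * SIGMA0_MATRIX * (rate / EPS0_MATRIX) ** (1.0 / M)
    fiber = phi * SIGMA0_FIBER * (rate / EPS0_FIBER) ** (1.0 / N)
    return matrix + fiber


def rule_of_mixtures_rate(sigma, phi, iterations=200):
    """Solve the rule of mixtures for the strain rate by bisection on log10(rate)."""
    lo, hi = -60.0, 10.0  # assumed bracket for log10 of the strain rate in 1/s
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if rule_of_mixtures_stress(10.0 ** mid, phi) < sigma:
            lo = mid
        else:
            hi = mid
    return 10.0 ** (0.5 * (lo + hi))


sigma, phi = 100.0, 0.65
print("Kelly-Street, l/d = 5 :", kelly_street_rate(sigma, phi, 5.0))
print("Kelly-Street, l/d = 40:", kelly_street_rate(sigma, phi, 40.0))
print("rule of mixtures      :", rule_of_mixtures_rate(sigma, phi))
```

Since the stress of the rule of mixtures increases monotonically with the strain rate, the bisection is guaranteed to converge as long as the solution lies inside the bracket. In this form, the Kelly-Street rate for σ = 100 MPa and φ = 65% lies above the rule of mixtures for l/d = 5 and below it for l/d = 40.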


In Fig. 7.13a, we compare the dependency of the creep rate on the cell 
volume fraction for the original aspect ratio of l/d = 5. As an additional 
data point, we consider the creep rate of the well-aligned material for 
φ = 100%. For volume fractions smaller than 85%, the plots of the 
analytical models and the simulations have a similar slope. However, the 
Kelly-Street model overestimates the effective creep rate by an order of 
magnitude compared to the simulation results, which lie roughly at the 
geometric mean between the Kelly-Street model and the rule of mixtures. 
In addition, the creep rate for the Kelly-Street model degenerates at 
φ = π/(2√3) ≈ 90.7%, i.e., the maximum volume fraction for a hexagonal packing 
of continuous fibers as assumed by Kelly and Street (1972b). The results 
highlight that using the Kelly-Street model beyond its intended regime 
may lead to inaccurate predictions. Indeed, Kelly and Street (1972b) note 
that their theory may be inaccurate for small l/d and validate their model 
for l/d = 50 and l/d = 100 (Kelly and Street, 1972a). Keeping in mind 
that the Kelly-Street model assumes a constant strain rate in the matrix 
and zero strain rate in the fibers, the origins of the model inaccuracy 
may be traced to the heterogeneity of the local fields. In Fig. 7.14, the 
strain rate in growth direction is visualized for l/d = 5. Note that, for 
the purpose of portraying the fields, we choose a higher resolution of 
4 µm per voxel. 


Figure 7.14: Strain rate component in growth direction for a microstructure with aspect 
ratio l/d = 5 and volume fraction φ = 65%. Panels: (a) boundary network; (b) cellular inclusions. 

Figure 7.15: Histograms of the strain rate in growth direction for various aspect ratios 


Evidently, the strain rate in both cellular inclusions and matrix is strongly 
heterogeneous for this case, see Fig. 7.15a for the corresponding his- 
togram. Thus, it is not surprising that the Kelly-Street model struggles to 
arrive at accurate predictions. With increasing aspect ratio, the strain-rate 
field becomes more homogeneous, see Fig. 7.15a-Fig. 7.15d, and the sim- 
ulated creep rates approach the results for the rule of mixtures. However, 
for high l/d, the assumption of zero strain-rate in the fibers leads to a 
vast underestimation of the effective creep rate of the composite by the 
Kelly-Street model, see Fig. 7.13b. Thus, we conclude that the model 
should be confined to cases where the inclusions are truly rigid. 


7.5 Conclusions 


The present work was devoted to studying the creep behavior of direc- 
tionally solidified NiAl-Mo eutectics with a cellular mesostructure using 
FFT-based micromechanics solvers. Our conclusions are as follows: 

• Combining the level-set framework for microstructure generation 
(Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015) with FFT-based 
solvers (Moulinec and Suquet, 1998) proves to be a flexible approach 
for simulating the creep response of cellular materials. In particular, 
the suggested procedure enables the individual control of morphological 
parameters such as cell volume fraction and aspect ratio. As alloys 
with a larger number of constituents in the NiAl-(Cr,Mo) system may 
be even more prone to developing microstructural irregularities, a 
flexible simulation tool set is crucial for assessing their creep response. 

• Simulations on both well-aligned and cellular material strongly suggest 
that the degenerated regions with high fiber misalignment do not 
substantially contribute to the overall creep strength of NiAl-Mo. As 
a result, the identified fraction of the hard regions was significantly 
lower than first estimated by Seemüller et al. (2013). This offers an 
explanation for the rather large decrease in creep resistance compared 
to the well-aligned material, which was found to be surprising at that 
time. 

• Studying the impact of morphology on the creep behavior of cellular 
NiAl-Mo, we observed that the volume fraction of the hard regions 
has a strong influence on the (minimum) creep rate, irrespective of the 
aspect ratio of the cells. The aspect ratio primarily determines the apparent 
stress exponent, i.e., whether the creep behavior is matrix-controlled 
or fiber-controlled. Hence, information on the mesostructure is crucial 
for comparing experimental creep data from different sources. 

• In contrast to Seemüller et al. (2013), we found that the shear-lag 
model by Kelly and Street (1972b) for quasi-rigid fibers was not able 
to accurately describe the creep behavior of cellular NiAl-Mo. The 
finite creep resistance of the inclusions, their relatively low aspect 
ratio and the resulting inhomogeneity of the microscopic strain-rate 
field were identified as main error sources. Furthermore, based on its 
geometric assumptions, the Kelly-Street model breaks down for cell 
volume fractions above 90%. Overall, the results demonstrate that 
the basic assumptions and scope of the model need to be carefully 
considered when it is used for interpreting experimental data. The 
Kelly-Street model for creeping fibers (Kelly and Street, 1972b, Sec. 3.2) 
may serve as a starting point for developing analytical models which 
address the aforementioned limitations. 


Chapter 8 


Summary and Conclusions 


In the present thesis, we investigated and developed high-performance 
FFT-based micromechanics solvers for efficiently computing the effective 
(thermo)mechanical response of applied materials. For one, we were 
interested in finding powerful general-purpose solvers, which perform 
well for a wide variety of problem classes, including nonlinear material 
behavior, infinite material contrast and computationally expensive mate- 
rial laws. Secondly, we developed dedicated algorithms for specialized 
applications such as crystal plasticity or thermomechanically coupled 
materials. All methods were tested for microstructures of industrial 
size and complexity. In particular, directionally solidified eutectics 
of the NiAl-(Cr, Mo) system, which are subject to active research as 
next-generation high-temperature materials, served as our primary 
material class of interest. Owing to their microstructure, with features 
encompassing multiple length scales, and the computationally expensive 
elasto-viscoplastic material behavior with strain-softening, the microme- 
chanical characterization of these materials represented a research topic 
of interest in and of itself, in addition to being a challenging benchmark 
for the investigated solvers. In the following, we list the main insights 
of each chapter, before closing with some concluding remarks. 

Chapter 3 

• Among the Lippmann-Schwinger solvers, the Barzilai-Borwein 
method (Barzilai and Borwein, 1988; Schneider, 2019a) emerges as 
the solver of choice for computationally cheap material laws, due to 
its minimal computational overhead. For computationally expensive 
materials, inexact (Quasi-)Newton methods are the preferred choice, 
as they lead to the lowest number of gradient evaluations. 

• The BFGS-CG method, approximating the local tangent-stiffness, 
represents a viable alternative to the classical Newton-CG method 
(Gélébart and Mondon-Cancel, 2013; Kabel et al., 2014). This is mainly 
due to the influence of the forcing term choice, i.e., the accuracy for 
solving the linear system. In general, solving to high-accuracy is 
suboptimal with respect to the total computation time. When solving 
the linear system to lower accuracy, having access to the exact tangent 
does not substantially improve the convergence rate. 


Chapter 4 


• By establishing the equivalence of the primal and dual variational 
principle for the cell problem with arbitrary boundary conditions (Kabel 
et al., 2016), all Lippmann-Schwinger solvers may be formulated 
in terms of the stress as primary field (Bhattacharya and Suquet, 2005). 
For certain small-strain crystal plasticity formulations, the dual frame- 
work proves to be beneficial as the inverse material law, i.e., mapping 
the stress to the strain, is cheaper to evaluate. This is rooted in the 
stress-explicit formulation of the flow rule. 

• As most of the computation time in crystal-plasticity simulations is 
spent evaluating the material law, the dual FFT-based solvers are 
about one order of magnitude faster than their primal counterparts. 
In particular, the competitiveness of the Barzilai-Borwein method is 
improved. 


Chapter 5 


The asymptotic homogenization framework by Chatzigeorgiou et al. 
(2016) establishes that only the macroscopic temperature enters the 
cell problem on the microscale, effectively decoupling mechanics 
and heat conduction. Thus, for computing the effective response of 
thermomechanically coupled materials, it is sufficient to solve a scalar 
equation in addition to the balance of linear momentum. 

Our proposed implicit staggered approach minimizes the additional 
effort for the mechanical solver and preserves the power of FFT-based 
methods, even for strongly coupled problems. The Barzilai-Borwein 
method emerges as a decent choice, as its convergence behavior is 
virtually unaffected. For the Newton-CG method, using an adaptive 
forcing term choice is crucial to compensate for the higher number of 
Newton iterations. 
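
As an illustration of what an adaptive forcing term can look like, the sketch below implements one widely used rule, the second choice proposed by Eisenstat and Walker (1996), including its safeguard. This is an assumption for illustration and not necessarily the rule used in this work; the function name `forcing_term` and the default parameters are likewise illustrative.

```python
def forcing_term(res_norm, res_norm_prev, eta_prev,
                 gamma=0.9, alpha=2.0, eta_max=0.9):
    """Relative tolerance for the next inner linear solve.

    res_norm and res_norm_prev are the nonlinear residual norms of the
    current and the previous Newton iterate, eta_prev is the forcing
    term used in the previous Newton step.
    """
    # The tolerance is tightened as the nonlinear residual decreases
    eta = gamma * (res_norm / res_norm_prev) ** alpha
    # Safeguard: do not let the tolerance drop too abruptly
    safeguard = gamma * eta_prev ** alpha
    if safeguard > 0.1:
        eta = max(eta, safeguard)
    return min(eta, eta_max)
```

Far from the solution the linear system is solved only loosely, and the tolerance is tightened as the Newton iteration converges.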


Chapter 6 


Using Anderson acceleration in combination with polarization-based 
schemes eliminates their sensitivity with respect to the choice of 
algorithmic parameters. More precisely, the user is relieved of fine- 
tuning the damping parameter and has access to more conservative 
step sizes. 
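
The basic mechanism of Anderson acceleration can be sketched for a generic fixed-point map. The example below uses depth one and a small linear contraction instead of a polarization scheme; the function name `anderson_depth_one` and the map `G` are assumptions made for illustration.

```python
def dot(a, b):
    """Euclidean inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def anderson_depth_one(G, x0, tol=1e-10, max_iter=100):
    """Solve x = G(x) with Anderson acceleration of depth one."""
    x_prev = list(x0)
    g_prev = G(x_prev)
    x = list(g_prev)  # the first step is a plain fixed-point update
    for it in range(1, max_iter + 1):
        g = G(x)
        r = [gi - xi for gi, xi in zip(g, x)]  # current residual
        if dot(r, r) ** 0.5 < tol:
            return x, it
        r_prev = [gi - xi for gi, xi in zip(g_prev, x_prev)]
        dr = [a - b for a, b in zip(r, r_prev)]
        denom = dot(dr, dr)
        # Least-squares coefficient minimizing |r - gamma * (r - r_prev)|
        gamma = dot(r, dr) / denom if denom > 0.0 else 0.0
        x_new = [gi - gamma * (gi - gpi) for gi, gpi in zip(g, g_prev)]
        x_prev, g_prev, x = x, g, x_new
    return x, max_iter


def G(x):
    """Hypothetical linear contraction used as a stand-in fixed-point map."""
    return [0.5 * x[0] + 0.2 * x[1] + 1.0,
            0.2 * x[0] + 0.3 * x[1] + 2.0]


x_fix, n_iter = anderson_depth_one(G, [0.0, 0.0])
```

The mixing coefficient is recomputed from the residual history in every step, so no damping parameter has to be tuned by hand.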

This considerably broadens the range of applications for polarization- 
based schemes, upgrading them to competitive general-purpose 
solvers. In particular, the developed A2DR algorithm compares well 
to various Lippmann-Schwinger solvers in their "ideal" problem 
setting. 


Chapter 7 


The transversely isotropic flow rule based on Naumenko and Al- 
tenbach (2005) captures the creep behavior of well-aligned NiAl-Mo 
with high accuracy. In particular, the decrease in creep resistance for 
off-angle loadings and the change from a fiber-dominated to a 
matrix-dominated stress exponent are faithfully reproduced. 

The level-set approach by Sonon et al. (2012; 2015) offers a flexible 
framework for fine-tuning the aspect ratio, volume fraction and 
boundary thickness of synthetic cell structures. 

Using the computational power of the developed FFT-solvers in combination 
with a statistical RVE approach (Kanit et al., 2003; Schneider 
et al., 2022) provides great flexibility for investigating the creep behav- 
ior of cellular NiAl-Mo under different morphological configurations. 


Notably, the influences of the cell volume fraction and the aspect ratio 
are virtually independent. The former has a larger impact on the creep 
resistance, whereas the latter controls the apparent stress exponent of 
the material. 


Regions containing degenerated and misaligned fibers have a creep 
response similar to that of binary NiAl and do not reinforce the 
overall material. Thus, the fraction of the hard region is lower than 
estimated in earlier studies (Seemüller et al., 2013). 


The sensitivity of the creep response to the loading direction and 
the morphology on the mesoscale sheds light on the high scatter of 
material parameters in earlier creep experiments. For a meaningful 
comparison of experimental studies, information on the morphology 
of the respective mesostructures is required. 

Considering the development of FFT-based methods, both Lippmann- 
Schwinger and polarization-based approaches have produced competi- 
tive general-purpose solvers, each with distinctive advantages and dis- 
advantages. In terms of raw performance, polarization-based schemes 
often have the upper hand, whenever applicable. However, even with 
the complexity-reduction approach of Schneider et al. (2019), the reliance 
on a polarization field and the application of the nonlinear $Z^0$ operator 
may seem unfamiliar to users of classical displacement-based mechanics 
solvers. In particular, auxiliary techniques and interfaces of practical 
relevance, such as thermomechanical coupling (see Ch. 5), composite 
voxels (Kabel et al., 2017) or UMAT support, are usually formulated in 
terms of strains or displacements. Hence, establishing compatibility with 
polarization-based methods is not straightforward and requires additional 
implementation effort. On the other hand, Lippmann-Schwinger 
solvers are usually easier to implement and maintain, although, for 
optimal performance, judiciously choosing the right algorithm for the 
problem at hand is still unavoidable. Thus, the "ideal" solver, combining 
fast performance, flexibility and memory efficiency, is yet to be developed. 
However, inexact (Quasi-)Newton methods, the Barzilai-Borwein 
method and (Anderson-accelerated) polarization schemes never stray 
too far from each other and are all worthy of a recommendation. 

With respect to further investigations of NiAl-(Cr, Mo) eutectics, the 
methods of Ch. 7 may be used to characterize the various types of 
microstructures, i.e., fibrous or lamellar colonies, arising at different 
compositions of the refractory metals. In particular, investigating the 
different mechanical responses for varying microstructures is crucial for 
identifying promising material compositions. Furthermore, transferring 
the results of the micromechanical investigations to the macroscale is 
of high interest. In this context, the anisotropic creep behavior of the 
directionally solidified eutectics may prove to be detrimental when 
the material is subjected to the multiaxial stress states encountered in 
real-world components. To facilitate such studies, either phenomenological 
surrogate models or data-driven approaches (Dvorak and Benveniste, 1992; 
Michel and Suquet, 2003; Gajek et al., 2020) may be informed by FFT-based 
computations. 
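A The Helmholtz decomposition for elasticity 


The Helmholtz decomposition is discussed, for instance, in Ch. 12.1 of 
Milton (2002). To establish consistency and for the convenience of the 
reader, it is introduced in this appendix for the case of elasticity. Let the 
$C^0$-weighted inner product on $V = L^2(Y;\mathrm{Sym}(d))$ be defined as 

\[
\langle S, T \rangle_{C^0} = \frac{1}{|Y|} \int_Y S : C^0 : T \, \mathrm{d}x, \quad S, T \in V. \tag{A.1}
\]

Then the operators 

\[
\langle \cdot \rangle_Y, \quad \Gamma^0 : C^0, \quad \text{and} \quad A^0 = \mathrm{Id} - \langle \cdot \rangle_Y - \Gamma^0 : C^0, \tag{A.2}
\]

with $\Gamma^0 = \nabla^s (\operatorname{div} C^0 \nabla^s)^{-1} \operatorname{div}$, form a complete set of complementary 
orthogonal projectors. They induce an orthogonal direct sum decomposition 
of $V$, 

\[
V = \operatorname{im} \langle \cdot \rangle_Y \oplus \operatorname{im} \Gamma^0 : C^0 \oplus \operatorname{im} A^0, \tag{A.3}
\]

with the subspaces 

\[
\begin{aligned}
\operatorname{im} \langle \cdot \rangle_Y &= \{ S \in V \mid S = \langle S \rangle_Y \}, \\
\operatorname{im} \Gamma^0 : C^0 &= \{ S \in V \mid S = \nabla^s u, \; u \in H^1_\#(Y;\mathrm{Sym}(d)), \; \langle S \rangle_Y = 0 \}, \\
\operatorname{im} A^0 &= \{ S \in V \mid \operatorname{div}\,[C^0 : S] = 0, \; \langle S \rangle_Y = 0 \}.
\end{aligned} \tag{A.4}
\]

Hence, any $S \in V$ can be decomposed into three components 

\[
S = \langle S \rangle_Y + \Gamma^0 : C^0 : S + A^0 : S, \tag{A.5}
\]

where $\langle S \rangle_Y$ is constant, $\Gamma^0 : C^0 : S = \nabla^s u$ is mean-free and compatible, 
and $A^0 : S = S - \langle S \rangle_Y - \nabla^s u$ is mean-free and divergence-free. 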
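
The projector properties can be checked directly from the definition of $\Gamma^0$. The following short calculation is a sketch added for illustration. It assumes that $\operatorname{div} C^0 \nabla^s$ is invertible on mean-free periodic displacement fields, that $C^0$ has the usual symmetries, and that $T$ satisfies $\operatorname{div}\,[C^0 : T] = 0$:

```latex
\begin{align*}
(\Gamma^0 : C^0) : (\Gamma^0 : C^0)
  &= \nabla^s (\operatorname{div} C^0 \nabla^s)^{-1}
     \underbrace{\operatorname{div} C^0 \nabla^s
       (\operatorname{div} C^0 \nabla^s)^{-1}}_{=\,\mathrm{Id}}
     \operatorname{div} C^0
   = \Gamma^0 : C^0, \\
\langle \nabla^s u, T \rangle_{C^0}
  &= \frac{1}{|Y|} \int_Y \nabla^s u : C^0 : T \, \mathrm{d}x
   = -\frac{1}{|Y|} \int_Y u \cdot \operatorname{div}\,[C^0 : T] \, \mathrm{d}x
   = 0.
\end{align*}
```

The first line shows idempotence. The second line uses integration by parts on the periodic cell and shows that compatible fields are orthogonal to the third subspace with respect to the $C^0$-weighted inner product.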
B The dual potential of the stress-based variational framework 


In the following, we discuss the derivation of the dual potential $W^*$, 
which was introduced ad hoc in Ch. 4.4 of the main text. Let $f : X \to \mathbb{R}$ 
be a convex function on the Banach space $X$. Let $f^* : X' \to \mathbb{R}$ be its 
Legendre transform 

\[
f^*(y) = \sup_{x \in X} \left( \langle x, y \rangle - f(x) \right), \tag{B.1}
\]

where $\langle \cdot, \cdot \rangle$ is the natural pairing $\langle \cdot, \cdot \rangle : X \times X' \to \mathbb{R}$. If $f$ and $f^*$ are $C^1$, 
then 

\[
y = Df(x) \quad \text{iff} \quad x = Df^*(y). \tag{B.2}
\]
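
As a simple worked example, added here for illustration, consider the scalar quadratic $f(x) = \tfrac{1}{2} a x^2$ on $X = \mathbb{R}$ with $a > 0$:

```latex
\begin{align*}
f^*(y) &= \sup_{x \in \mathbb{R}} \left( x y - \tfrac{1}{2} a x^2 \right)
        = \frac{y^2}{2a}, \qquad \text{attained at } x = \frac{y}{a}, \\
Df(x) &= a x, \qquad Df^*(y) = \frac{y}{a},
\end{align*}
```

so that $y = Df(x)$ holds if and only if $x = Df^*(y)$. The tensorial analogue is linear elasticity, where $w(\varepsilon) = \tfrac{1}{2}\, \varepsilon : C : \varepsilon$ leads to $w^*(\sigma) = \tfrac{1}{2}\, \sigma : C^{-1} : \sigma$.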


Suppose the closed subset $U \subset X$ is a convex cone, i.e., 

\[
\theta_1 x_1 + \theta_2 x_2 \in U \quad \text{for all } x_1, x_2 \in U \text{ and } \theta_1, \theta_2 \geq 0, \tag{B.3}
\]

and let $U^* \subset X^*$ be its dual cone 

\[
U^* = \{ y \in X^* \mid \langle y, x \rangle \geq 0 \quad \forall x \in U \}. \tag{B.4}
\]

Then, according to Theorem 31.4 in Sec. 31 of Rockafellar (1970), 

\[
\min_{x \in U} f(x) = - \min_{y \in U^*} f^*(y) \tag{B.5}
\]

holds. 

In the context of $X = L^2(Y;\mathrm{Sym}(d))$, $X'$ can be identified with $X$ via 
the Riesz map and $\langle \cdot, \cdot \rangle$ can be identified with $\langle \cdot, \cdot \rangle_{L^2}$. Consider the 
minimization problem 

\[
\min_{x \in U} f(x) \tag{B.6}
\]
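with the objective function 

\[
f(\hat{\varepsilon}) = \langle w(\bar{\varepsilon} + \hat{\varepsilon}) - \bar{\sigma} : \hat{\varepsilon} \rangle_Y, \tag{B.7}
\]

see (4.7), and 

\[
U = \{ \hat{\varepsilon} \in X \mid \hat{\varepsilon} = \langle \hat{\varepsilon} \rangle_Y + \nabla^s u, \; u \in H^1_\#(Y;\mathrm{Sym}(d)), \; P : \langle \hat{\varepsilon} \rangle_Y = 0 \}. \tag{B.8}
\]

$U$ is a closed subspace of $X$ and therefore a convex cone, see Boyd and 
Vandenberghe (2004). Hence, its dual cone $U^*$ is equal to its annihilator 

\[
U^\circ = \{ \hat{\sigma} \in X \mid \operatorname{div} \hat{\sigma} = 0, \; Q : \langle \hat{\sigma} \rangle_Y = 0 \}, \tag{B.9}
\]

so that 

\[
\langle \hat{\sigma}, \hat{\varepsilon} \rangle_{L^2} = 0 \quad \text{for all } \hat{\sigma} \in U^\circ \text{ and } \hat{\varepsilon} \in U. \tag{B.10}
\]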

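The Legendre transform of $f$ reads 

\[
\begin{aligned}
f^*(\hat{\sigma}) &= \sup_{\hat{\varepsilon} \in X} \langle \hat{\sigma} : \hat{\varepsilon} - w(\bar{\varepsilon} + \hat{\varepsilon}) + \bar{\sigma} : \hat{\varepsilon} \rangle_Y \\
&= \sup_{\varepsilon \in X} \langle \sigma : \varepsilon - w(\varepsilon) \rangle_Y - \langle \sigma \rangle_Y : \bar{\varepsilon} \\
&= \langle w^*(\sigma) \rangle_Y - \langle \sigma \rangle_Y : \bar{\varepsilon},
\end{aligned} \tag{B.11}
\]

with 

\[
w^*(\sigma) = \sup_{\varepsilon \in X} \left( \sigma : \varepsilon - w(\varepsilon) \right) \quad \text{and} \quad \varepsilon = \bar{\varepsilon} + \hat{\varepsilon}. \tag{B.12}
\]

In the main text, we denote $W(\hat{\varepsilon}) = f(\hat{\varepsilon})$ and $W^*(\hat{\sigma}) = f^*(\hat{\sigma})$, expressed in 
terms of $\sigma = \bar{\sigma} + \hat{\sigma}$. With this choice of $W^*(\hat{\sigma})$, we obtain by (B.5) 

\[
\min_{\hat{\varepsilon} \in U} W(\hat{\varepsilon}) = - \min_{\hat{\sigma} \in U^\circ} W^*(\hat{\sigma}). \tag{B.13}
\]

This establishes the equality of the primal and dual variational problems 
in Section 4.2.2 and Section 4.4. 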
C Remarks on computing the $DN_k$ level sets on voxel images 


C.1 Using Euclidean distance transforms to 
compute level sets 


In the following, we briefly lay out how Euclidean distance transforms 
(EDTs), which are standard algorithms in image processing (Fabbri et al., 
2008), may be exploited for computing the $DN_k$ level sets on voxel 
images. Suppose a binary microstructure image $\mathcal{I} : \Omega \to \{0,1\}$ is given 
on a discretized domain $\Omega = \{0,\dots,N\}^3$, with a set of inclusion voxels 
$\Phi$ (typically with value 1) and matrix voxels $\Phi^c = \Omega \setminus \Phi$ (typically with 
value 0). For any image with a marked set of object voxels $\mathcal{O}$, an EDT 
assigns to each voxel the distance to its nearest object voxel. Thus, the 
signed distance field $DN_1$ of $\mathcal{I}$ may be computed by a three-step process: 
1. Identify the boundary voxels $\partial\Phi$ of the inclusions (Gonzalez and 
Woods, 2018, Sec. 9.5). For our implementation, we check the connectivity 
based on the 6-neighbourhood, i.e., voxels are connected if they 
share a face. 
2. Compute the EDT with the boundary $\partial\Phi$ as object set $\mathcal{O}$. 
3. Assign a negative sign to the distance for all voxels inside $\Phi$. 
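
The three steps above can be sketched in a few lines. The example below is a deliberately simple two-dimensional, non-periodic illustration with a brute-force distance transform, where the 4-neighbourhood plays the role of the 6-neighbourhood. The function names are assumptions made for illustration, and a production code would replace the brute-force search by a proper EDT.

```python
from math import inf, sqrt


def boundary_voxels(image):
    """Step 1: inclusion pixels that share an edge with a matrix pixel."""
    ny, nx = len(image), len(image[0])
    boundary = []
    for y in range(ny):
        for x in range(nx):
            if image[y][x] != 1:
                continue
            candidates = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            # Neighbours outside the image are ignored (non-periodic image)
            if any(0 <= j < ny and 0 <= i < nx and image[j][i] == 0
                   for j, i in candidates):
                boundary.append((y, x))
    return boundary


def signed_distance(image):
    """Steps 2 and 3: distance to the boundary, negative inside inclusions."""
    ny, nx = len(image), len(image[0])
    boundary = boundary_voxels(image)
    field = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            # Step 2: brute-force EDT with the boundary pixels as objects
            d = min((sqrt((y - j) ** 2 + (x - i) ** 2) for j, i in boundary),
                    default=inf)
            # Step 3: negative sign inside the inclusions
            field[y][x] = -d if image[y][x] == 1 else d
    return field


# A 3 x 3 inclusion centred in a 5 x 5 image
image = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
field = signed_distance(image)
```

With this convention the boundary pixels themselves carry the value zero, and the centre pixel of the inclusion has the value -1.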


For computing the nearest neighbour level sets $DN_k$ of higher order, the 
sequential updating strategy by Sonon (2014, Sec. 2.4.1) may be used. 
To this end, the subsets $\Phi_i$ of all particles, with $\Phi = \bigcup_i \Phi_i$, have to 
be identified in a pre-processing step, using a connected-component 
extraction algorithm (Gonzalez and Woods, 2018, Sec. 9.5). For each 
particle, firstly, its signed distance field $DS_{\Phi_i}$ is computed by applying 
the outlined three-step process to $\Phi_i$. Secondly, the $DN_k$ fields are 
updated according to (Sonon, 2014, Sec. 2.4.1) 

\[
\begin{aligned}
DN_k &\leftarrow \max(DN_{k-1}, \min(DN_k, DS_{\Phi_i})), \\
DN_1 &\leftarrow \min(DN_1, DS_{\Phi_i}),
\end{aligned} \tag{C.1}
\]

starting with $k_{\max}$, i.e., the highest desired value of $k$. Note that the 
computational effort for this generic strategy is proportional to the 
number of particles in the image $\mathcal{I}$. However, certain EDTs may be 
modified to evaluate $DN_k$ in a single pass, as shown in the next section. 
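
The effect of the update (C.1) is easiest to see at a single voxel. The sketch below keeps the $k_{\max}$ smallest signed distances in ascending order while the particles are processed one after another. The function name and the initialization with $+\infty$ are assumptions made for illustration.

```python
from math import inf


def update_levels(dn, ds):
    """Apply the update (C.1) at one voxel.

    dn[k - 1] holds the current value of DN_k, ds is the signed distance
    to the particle that is being added.
    """
    for k in range(len(dn) - 1, 0, -1):  # from k_max down to 2
        dn[k] = max(dn[k - 1], min(dn[k], ds))
    dn[0] = min(dn[0], ds)
    return dn


# Distances of one voxel to four particles, processed in arbitrary order
dn = [inf, inf, inf]  # k_max = 3
for ds in [5.0, 3.0, 4.0, 7.0]:
    update_levels(dn, ds)
```

After the loop, `dn` contains the three smallest distances in ascending order. Because the levels are updated from $k_{\max}$ downwards, each update reads the previous value of the next lower level before it is overwritten.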


C.2 Choice of EDT algorithm 


For an extensive performance comparison and discussion of various 
EDTs for 2-dimensional images, we refer to the study by Fabbri et al. 
(2008). Following their taxonomy, EDTs may be broadly categorized into 
scanning algorithms and propagating algorithms (Fabbri et al., 2008), 
differing in the order in which the voxels are processed. For the present 
discussion, we consider one representative algorithm of each family. 

In scanning algorithms, the image is processed in terms of its rows, 
columns and planes. The fastest EDT in this category (Fabbri et al., 
2008) is the algorithm by Meijster et al. (2002), which exploits that the 
minimization problem for computing the squared Euclidean distance 
transform may be solved for each spatial dimension separately. The 
algorithm lends itself well to parallelization and periodicity of the image 
can be incorporated at virtually no additional cost, using the scheme of 
Coeurjolly (2008). 
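
The separability exploited by scanning algorithms can be demonstrated on a small two-dimensional image. In the sketch below, each one-dimensional minimization is carried out by brute force and periodicity is ignored, so this is not the algorithm of Meijster et al. (2002), which solves the one-dimensional problems in linear time. The function names are assumptions made for illustration.

```python
from math import inf


def squared_edt_separable(objects):
    """Squared EDT by two successive one-dimensional minimizations."""
    ny, nx = len(objects), len(objects[0])
    # Pass 1: squared distance to the nearest object pixel within each row
    g = [[min(((x - i) ** 2 for i in range(nx) if objects[y][i]), default=inf)
          for x in range(nx)] for y in range(ny)]
    # Pass 2: combine the row results along each column
    return [[min((y - j) ** 2 + g[j][x] for j in range(ny))
             for x in range(nx)] for y in range(ny)]


def squared_edt_brute_force(objects):
    """Reference solution: search over all object pixels at once."""
    ny, nx = len(objects), len(objects[0])
    pts = [(j, i) for j in range(ny) for i in range(nx) if objects[j][i]]
    return [[min(((y - j) ** 2 + (x - i) ** 2 for j, i in pts), default=inf)
             for x in range(nx)] for y in range(ny)]


# Two object pixels in opposite corners of a 4 x 5 image
objects = [[True, False, False, False, False],
           [False, False, False, False, False],
           [False, False, False, False, False],
           [False, False, False, False, True]]
separable = squared_edt_separable(objects)
reference = squared_edt_brute_force(objects)
```

Both functions return the same field, since the squared distance is a sum of one term per dimension and the minimization can therefore be carried out one dimension at a time.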
Figure C.1: Overview of the microstructure and the $DN_1$ (red scale) and $DN_2$ (blue scale) 
level sets for a structure with 500 fibers 


Propagating algorithms update the distance field in a narrow band or 
wavefront, emanating from the object voxels. As Dijkstra-type algo- 
rithms they are very similar to the fast marching method for solving 
the (related but more general) eikonal equation. Algorithms of this 
type (Ragnemalm, 1992; Cuisenaire and Macq, 1999; Lotufo et al., 2000) 
mostly differ in details such as the data structure for the wavefront or the 
propagated information, see Sec. 7.4.1. in Fabbri et al. (2008) for a generic 
description. For our implementation, we choose the algorithm by Lotufo 
et al. (2000) using a bucket queue as data structure and propagating the 
nearest object voxel. The bucket queue enables a partial parallelization 
of the algorithm. Periodicity is integrated by considering the periodic 
6-neighbourhood during propagation. Note that propagation-type al- 
gorithms are generally not exact, see Cuisenaire and Macq (1999) for a 
detailed discussion of the 2-dimensional case. However, in our studies 
the maximum error for computing $DN_k$ was usually below a single 
voxel length, which we consider acceptable. 
Figure C.2: Benchmark of the scanning EDT by Meijster et al. (2002) and the propagating 
EDT by Lotufo et al. (2000) (and its modified version Alg. 8) for computing the nearest 
neighbour level sets of isotropically packed fiber structures 


As a first benchmark, we investigate the performance of the algorithms 
for computing the $DN_1$ level set of a microstructure generated with 
the SAM algorithm by Schneider (2017b), featuring 500 isotropically 
distributed fibers with an aspect ratio of 10, occupying a volume fraction 
of 23.5%, see Fig. C.1. All EDT benchmarks were performed on a desktop 
computer with an Intel i7-8700K CPU using 6 threads. The runtimes 
for different spatial discretizations from $64^3$ voxels up to $512^3$ voxels 
are shown in Fig. C.2a. For the chosen structure, both EDT algorithms 
exhibit linear time complexity with respect to the voxel count. However, 
the scanning algorithm is more than an order of magnitude faster than 
the propagating algorithm, confirming the trends observed by Fabbri 
et al. (2008) in the 3-dimensional setting. 

The vast difference in performance may suggest that this is the end 
of the story and that the scanning algorithm by Meijster et al. (2002) is 
clearly superior. However, the situation changes when considering 
level sets $DN_k$ of higher order. As far as the authors are aware, the 
scanning algorithm is limited to the sequential updating strategy (C.1) 
outlined in the last section. Thus, the runtime for computing the $DN_k$ 
level sets becomes dependent on the number of inclusions. This is 
illustrated in Fig. C.2b, where the runtimes for computing $DN_1$ and 
$DN_2$ for microstructures with a fixed voxel count of $256^3$ and a varying 
number of packed fibers are plotted. 

On the other hand, the propagating algorithm by Lotufo et al. (2000) can 
be naturally modified to compute the $DN_k$ level sets up to a maximum 
level $k_{\max}$ in a single pass. A pseudocode for the modified algorithm 
is outlined in Alg. 8. Informally speaking, a unique label is assigned 
to each object and the associated emanating wavefront. By allowing 
wavefronts with different labels to pass through each other, the order 
of arrival at a certain coordinate determines the level $k$ of $DN_k$. At 
the end, all points are visited $k_{\max}$ times. Note that a similar concept 
for fast marching algorithms was outlined by Sonon (2014, Sec. 2.4.1). 
The performance of the resulting single-pass propagating algorithm is 
virtually independent of the number of inclusions, see Fig. C.2b. In 
particular, it becomes the preferable option for object counts larger than 
100. 

Finally, some closing remarks are in order: 

1. Due to the limited size of feasible volume elements for cellular NiAl- 
Mo, see Sec. 7.4.2, we did not exceed fiber counts of 100 during the 
microstructure generation process. Thus, the scanning algorithm of 
Meijster et al. (2002) was used for the present study. However, the 
propagating scheme in Alg. 8 is more suitable as a general-purpose 
method. 

2. The sequential scanning algorithm may compute the level sets for 
higher $k_{\max}$ at little additional cost, as the updating step (C.1) is 
usually less expensive than computing the level set of a single particle. 
On the other hand, the computational effort for the single-pass 
propagating algorithm increases notably, as more voxels need to be 
processed. Thus, at higher $k_{\max}$ the break-even point in terms of 
inclusion count may shift to higher numbers in favor of the scanning 
algorithm. However, for the most common morphing operations 
(Sonon et al., 2012; Sonon, 2014; Sonon et al., 2015) the necessary $k_{\max}$ 
does not exceed 3. Hence, our evaluation of the algorithms is not 
substantially affected. 

3. If a sequential evaluation of the level sets is unavoidable, e.g., as 
part of the LS-RSA microstructure generation process by Sonon et al. 
(2012), then Meijster's algorithm (Meijster et al., 2002) is the method of 
choice. Due to its high efficiency for computing the level set of a single 
particle, it spares the user the need for pre-screening strategies (Sonon 
et al., 2012). In addition, using Coeurjolly's approach (Coeurjolly, 
2008) avoids the creation and consideration of periodic neighbours. 


Algorithm 8 Propagation algorithm for computing $DN_k$ in a single pass 


Auxiliaries: 

A voxel object v stores its associated coordinates v.x, a label v.label, a 
root voxel v.root and the periodic square distance to its root v.dsquare 
Q is a queue for storing voxels, ordered by their square distance 
$\mathcal{I}_{\mathrm{labeled}}(x)$ is an array with the same structure as $\mathcal{I}$ but the coordinates 
of each object $\Phi_i$ are marked with a unique integer label 
N(x) returns the voxels with coordinates of the periodic neighbourhood 
of point x 
V(x) is an array, storing the number of wavefronts, which have passed 
point x 
DN(x, k) is an array, storing the value of the $DN_k$ level set at point x 
L(x, k) is an array storing the label of the kth wavefront which has 
passed x 


Input: Binary image $\mathcal{I}$, maximum depth $k_{\max}$ 
Output: Level sets DN(x, k) 


 1: Initialize V, DN and L to 0 
 2: Extract particles $\Phi_i$, assign an integer label $\geq 1$ to each particle and 
    initialize $\mathcal{I}_{\mathrm{labeled}}(x)$ 
 3: Identify the particle boundaries $\partial\Phi_i$ 
 4: Add all points in $\partial\Phi_i$ with their associated label and themselves as 
    root to Q 
 5: while Q is not empty do 
 6:     Remove voxel v with smallest v.dsquare from Q 
 7:     if V(v.x) < $k_{\max}$ and v.label $\notin$ {L(v.x, 0), ..., L(v.x, $k_{\max}$)} then 
 8:         L(v.x, V(v.x)) $\leftarrow$ v.label 
 9:         d $\leftarrow$ $\sqrt{\text{v.dsquare}}$ 
10:         if v.label = $\mathcal{I}_{\mathrm{labeled}}$(v.x) then 
11:             d $\leftarrow$ $-$d 
12:         end if 
13:         DN(v.x, V(v.x)) $\leftarrow$ d 
14:         V(v.x) $\leftarrow$ V(v.x) + 1 
15:         for each n $\in$ N(v.x) do 
16:             if V(n.x) < $k_{\max}$ and v.label $\notin$ {L(n.x, 0), ..., L(n.x, $k_{\max}$)} then 
17:                 n.root $\leftarrow$ v.root 
18:                 n.label $\leftarrow$ v.label 
19:                 n.dsquare $\leftarrow$ $\|$n.x $-$ n.root.x$\|^2$ 
20:                 Add n to Q 
21:             end if 
22:         end for 
23:     end if 
24: end while 
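
For readers who prefer executable code, the following is a minimal two-dimensional sketch of Alg. 8. It assumes that the image has already been labelled (0 for matrix, integers from 1 upwards for the particles), uses Python's binary heap `heapq` in place of the bucket queue, and works with the periodic 4-neighbourhood. The function names are illustrative, and the code is not the implementation used for the benchmarks.

```python
import heapq
from math import sqrt


def periodic_sq_dist(p, q, shape):
    """Squared Euclidean distance between two pixels on a periodic grid."""
    total = 0
    for a, b, n in zip(p, q, shape):
        d = abs(a - b)
        d = min(d, n - d)
        total += d * d
    return total


def single_pass_dn(labeled, k_max):
    """Return dn[y][x][k - 1], the DN_k level sets for k = 1, ..., k_max."""
    ny, nx = len(labeled), len(labeled[0])

    def neighbours(p):
        y, x = p
        return [((y - 1) % ny, x), ((y + 1) % ny, x),
                (y, (x - 1) % nx), (y, (x + 1) % nx)]

    visits = [[0] * nx for _ in range(ny)]                          # V(x)
    dn = [[[0.0] * k_max for _ in range(nx)] for _ in range(ny)]    # DN(x, k)
    labels = [[[0] * k_max for _ in range(nx)] for _ in range(ny)]  # L(x, k)

    # Steps 3 and 4: boundary pixels enter the queue as their own roots.
    # Queue entries are (dsquare, coordinates, label, root coordinates).
    queue = []
    for y in range(ny):
        for x in range(nx):
            lab = labeled[y][x]
            if lab and any(labeled[j][i] != lab for j, i in neighbours((y, x))):
                heapq.heappush(queue, (0, (y, x), lab, (y, x)))

    while queue:
        dsq, p, lab, root = heapq.heappop(queue)
        y, x = p
        if visits[y][x] < k_max and lab not in labels[y][x]:
            k = visits[y][x]
            labels[y][x][k] = lab
            d = sqrt(dsq)
            if labeled[y][x] == lab:  # inside the particle of this wavefront
                d = -d
            dn[y][x][k] = d
            visits[y][x] = k + 1
            for q in neighbours(p):
                j, i = q
                if visits[j][i] < k_max and lab not in labels[j][i]:
                    heapq.heappush(
                        queue, (periodic_sq_dist(q, root, (ny, nx)), q, lab, root))
    return dn


# Two single-pixel particles on a periodic 1 x 7 strip
dn = single_pass_dn([[1, 0, 0, 0, 2, 0, 0]], k_max=2)
```

In this small example, the pixel at index 5 lies at distance 1 from particle 2 and, across the periodic boundary, at distance 2 from particle 1, so its two level-set values are 1 and 2.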